Raytracing On OpenGL Compute Shader

Compute Shader GLSL Variables

uvec3 gl_NumWorkGroups    global work group size we gave to glDispatchCompute()
uvec3 gl_WorkGroupSize    local work group size we defined with layout
uvec3 gl_WorkGroupID    position of current invocation in global work group
uvec3 gl_LocalInvocationID    position of current invocation in local work group
uvec3 gl_GlobalInvocationID    unique index of current invocation in global work group
uint gl_LocalInvocationIndex    1d index representation of gl_LocalInvocationID

Execution:

执行渲染是：一个texture到full-screen quad，当然是要用个矩形绘制填充NDC

Creating Texture/Image创建纹理：

创建32位图，最后一句话 OpenGL treats "image units" slightly differently to textures, so we call a glBindImageTexture() function to make this link. Note that we can set this to "write only".

这个贴图单元与普通得textures稍微不一样，用最后一句函数可以让图片写入。

// dimensions of the image
int tex_w = 512, tex_h = 512;
GLuint tex_output;
glGenTextures(1, &tex_output);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, tex_output);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, tex_w, tex_h, 0, GL_RGBA, GL_FLOAT,
 NULL);
glBindImageTexture(0, tex_output, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_RGBA32F);

Determining the Work Group Size线程开辟

glDispatchCompute()函数可以决定我们在compute shader invocations定义计算量。首先得到最大size of the total work group

通过以下函数得到在x,y,z上 total work group 范围:

int work_grp_cnt[3];

glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, 0, &work_grp_cnt[0]);
glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, 1, &work_grp_cnt[1]);
glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, 2, &work_grp_cnt[2]);

printf("max global (total) work group counts x:%i y:%i z:%i
",
  work_grp_cnt[0], work_grp_cnt[1], work_grp_cnt[2]);

得到最大支持 local work group 大小(sub-division of the total number of jobs总任务局部细分),这个是着色器内部定义的，用关键字layout。

int work_grp_size[3];

glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_SIZE, 0, &work_grp_size[0]);
glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_SIZE, 1, &work_grp_size[1]);
glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_SIZE, 2, &work_grp_size[2]);

printf("max local (in one shader) work group sizes x:%i y:%i z:%i
",
  work_grp_size[0], work_grp_size[1], work_grp_size[2]);

更进一步：在compute shader的local work group 最大工作单元(work group units)是如果我们如果在one local work group 执行32X32的工作任务，意味着不能超过这个32*32 = 1024这个值。

local work group size是根据设备来的。用合理的限制，让用户合理的调整 local work group 的大小可以获取更好的性能。

我的本机输出：

max global (total) work group size x:2147483647 y:65535 z:65535
max local (in one shader) work group sizes x:1024 y:1024 z:64
max computer shader invocations 1024

从简单的设置开始：

------把全局工作组大小(Global work group size) 设置与贴图一样大512*512

------局部工作组大小(Local work group size) 设置成1个像素 1*1

------设置Z大小为1

编写最基础的Compute shader

#version 450
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;

第一行的layout是定义local work group 大小。这个是后台决定，如果要调整 local work group 更大点的话，我们不用改材质。

这里我们决定用1个像素用1*1，如果工作组要支持 1d or 3d 需要改结构。

第二行的layout是图片设置，注意不是uniform sampler XXX，而是 uniform image2D XXX

现在开始设置一个黑色shader:

void main() {
  // base pixel colour for image
  vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
  // get index in global work group i.e x,y position
  ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);
  
  //
  // interesting stuff happens here later

  //
  // output to a specific pixel in the image
  imageStore(img_output, pixel_coords, pixel);
}

第二行坐标是用:GLSL内建变量 gl_GlobalInvocationID.xy定位调用在工作组的坐标。

编译材质用这个类型：GL_COMPUTE_SHADER：

GLuint ray_shader = glCreateShader(GL_COMPUTE_SHADER);
glShaderSource(ray_shader, 1, &the_ray_shader_string, NULL);
glCompileShader(ray_shader);
// check for compilation errors as per normal here

GLuint ray_program = glCreateProgram();
glAttachShader(ray_program, ray_shader);
glLinkProgram(ray_program);
// check for linking errors and validate program as per normal here

我们当然还要渲染图片到最后的quad上，这样就可以读取我们这个 image2d贴图

Dispatch the shaders 渲染循环中执行材质:

第一步先tm的渲染compute shader texture，把z轴设置为1：

glUseProgram(ray_program);
glDispatchCompute((GLuint)tex_w, (GLuint)tex_h, 1);

// make sure writing to image has finished before read
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);

To make sure that the compute shaders have completely finished writing to the image before we start sampling, we put in a memory barrier with glMemoryBarrier() and the image access bit 。

为了保证图片完成之后采样，所以用了个glMemoryBarrier(),也可以You can instead use GL_ALL_BARRIER_BITS to be on the safe side for all types of writing

第二步，正常绘制到quad上：

// normal drawing pass
glClear(GL_COLOR_BUFFER_BIT);
glUseProgram(quad_program);
glBindVertexArray(quad_vao);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, tex_output);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

我把贴图绘制设置成绿色：

NEXT:Raytracing

一开始是将scene直接hard-code模式写入我们compute shader。

Ray定义：origin和direction

我们想分布ray's origin 到我们像素，作者定义为NDC:-5.0到5.0 X和Y都在这个区间

float max_x = 5.0;
float max_y = 5.0;
ivec2 dims = imageSize(img_output); // fetch image dimensions
float x = (float(pixel_coords.x * 2 - dims.x) / dims.x);
float y = (float(pixel_coords.y * 2 - dims.y) / dims.y);
vec3 ray_o = vec3(x * max_x, y * max_y, 0.0);
vec3 ray_d = vec3(0.0, 0.0, -1.0); // ortho

在这里我不映射 -5到5 ，直接就按照一个像素 1 的大小。整个图片中心为(0，0）点。然后每个像素的中心为光线中心。

所以应该应用:

#version 450 core
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;

void main() {
    const float pixel_size = 1.0f;
    ivec2 texsize = imageSize(img_output);  // get current texture size , 500 * 500

    // base pixel colour for image
    vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
    // get index in global work group i.e x,y position
    ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);
    pixel.r = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0));
    pixel.g = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0));
    // output to a specific pixel in the image
    imageStore(img_output, pixel_coords, pixel);
}

此时范围在xy属于-250 到 249。如果在这个基础上做多重抖动采样可以刚好满足 -250 到 250：

抖动采样伪代码：

 // for samples
            for (int i = 0; i < vp.num_samples; i++) {
                sample_point = vp.sampler_ptr->sample_unit_square();
                pixel_pos[0] = vp.size * (c - 0.5*vp.hres + sample_point[0]);
                pixel_pos[1] = vp.size * (r - 0.5*vp.vres + sample_point[1]);
                ray.o = RT_VEC_3D({ pixel_pos[0],pixel_pos[1],RT_SCALAR(zw) });
                pixel_color += tracer_ptr->trace_ray(ray);
            }
            pixel_color /= vp.num_samples;

View Code

对比houdini中的测试:

Raytracing a sphere:

#version 450 core
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;

const float k=0.000001;

struct Ray{
    vec3 o;
    vec3 d;
};

struct Sphere{
    vec3 c;
    float r;
};

bool intersectSphere(Ray ray, Sphere sphere){
    float t;
    float tmin = 0.0f;
    float A = dot(ray.d,ray.d);
    float B = 2* dot(ray.o-sphere.c,ray.d);
    float C = dot(ray.o-sphere.c,ray.o-sphere.c) - sphere.r * sphere.r;
    float disc = B * B - 4.0f * A * C;
    if(disc <0.0 ){
        return false;
    }
    else{
        float e = sqrt(disc);
        float denom = 2.0 * A;
        t = (-B -e) / denom; // smaller root
        if(t > k){
            tmin = t;
            return true;
        }

        t = (-B + e) / denom; // large root
        if(t > k){
            tmin = t;
            return true;
        }
    }
    return false;
}


void main() {

    const float pixel_size = 1.0f;
    ivec2 texsize = imageSize(img_output);  // get current texture size , 500 * 500

    // base pixel colour for image
    vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
    // get index in global work group i.e x,y position
    ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);

    //
    // interesting stuff happens here later
    //


    float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0f));
    float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0f));

    //
    Ray ray; // ortho
    ray.o = vec3(x,y,0.0f);
    ray.d = vec3(0.0f, 0.0f, -1.0f);

    Sphere sphere1;
    sphere1.c = vec3(150,0.0,-200);
    sphere1.r = 100.0f;


    if(intersectSphere(ray,sphere1)){
        pixel.r = 1.0f;
    }


    // output to a specific pixel in the image
    imageStore(img_output, pixel_coords, pixel);
}

compute shader

#define GLEW_STATIC
// GLEW
#include <GL/glew.h>
#include <cstdlib>
#undef GLFW_DLL
// GLFW
#include <GLFW/glfw3.h>
#include <iostream>
#include "ALG_GLFWCamera.h"
#include "ALG_FrameWindow.h"
#include "ALG_DrawPostPocessingQuad.h"
#include "ALG_OGLHelper.h"
#include <cmath>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <glm/gtc/type_ptr.hpp>
#include "ALG_FPS.h"
#include "ALG_LoadComputeShader.h"


using namespace AlgebraMaster;

const unsigned int SRC_WIDTH = 500;
const unsigned int SRC_HEIGHT = 500;

void init();
void display();


// camera
static GLFWCamera *camera;
static float lastX =  float(SRC_WIDTH) / 2.0f;
static float lastY =  float(SRC_HEIGHT) / 2.0f;
static bool firstMouse = true;
static bool firstMiddowMouse = true;
// timing
static float deltaTime = 0.0f;    // time between current frame and last frame
static float lastFrame = 0.0f;



static FrameWindow *frameWindow ;
static DrawPostProcessingQuad<GL_RGBA32F> quad;
static LoadComputeShade computeShade;


GLuint rayTexture;

void init(){

    camera = new GLFWCamera;
    if(!CheckExtension("GL_ARB_shading_language_include")){
        cout << "---------------ERROR:: SHADER DO NOT SUPPORT INCLUDE SHADER----------------------------------
";
    }
    AddCommonShaderFile("shaders/common/material_interface.glsl");
    AddCommonShaderFile("shaders/common/light_interface.glsl");
    AddCommonShaderFile("shaders/common/shadow.glsl");
    AddCommonShaderFile("shaders/common/utils.glsl");
    AddCommonShaderFile("shaders/common/postprocess.glsl");

    // Dump OGL infos
    int work_grp_size[3], work_grp_inv;
    // maximum global work group (total work in a dispatch)
    glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_COUNT, 0, &work_grp_size[0] );
    glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_COUNT, 1, &work_grp_size[1] );
    glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_COUNT, 2, &work_grp_size[2] );
    printf( "max global (total) work group size x:%i y:%i z:%i
", work_grp_size[0],
            work_grp_size[1], work_grp_size[2] );
    // maximum local work group (one shader's slice)
    glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_SIZE, 0, &work_grp_size[0] );
    glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_SIZE, 1, &work_grp_size[1] );
    glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_SIZE, 2, &work_grp_size[2] );
    printf( "max local (in one shader) work group sizes x:%i y:%i z:%i
",
            work_grp_size[0], work_grp_size[1], work_grp_size[2] );
    // maximum compute shader invocations (x * y * z)
    glGetIntegerv( GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, &work_grp_inv );
    printf( "max computer shader invocations %i
", work_grp_inv );


    // create texture
    glCreateTextures(GL_TEXTURE_2D,1,&rayTexture);
    glBindTexture(GL_TEXTURE_2D,rayTexture);
    glTextureParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTextureParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR );
    // same internal format as compute shader input
    glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA32F, SRC_WIDTH, SRC_HEIGHT, 0, GL_RGBA, GL_FLOAT,
                  NULL );
    // bind to image unit so can write to specific pixels from the shader
    glBindImageTexture( 0, rayTexture, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_RGBA32F );



    // load compute shader
    computeShade.load("shaders/raytracing/constant_coulour.glsl");
    // initialize our quad geo
    quad.initialize();
    quad.setupWidthHeight(SRC_WIDTH,SRC_HEIGHT);

}




// ----------- Render Loop ----------
void display(){

    int display_w, display_h;
    glfwGetFramebufferSize(frameWindow->getWindow(), &display_w, &display_h);
    computeShade.use();
    glDispatchCompute( (GLuint)SRC_WIDTH, (GLuint)SRC_HEIGHT, 1 ); // keep z as 1
    // prevent sampling befor all writes to image are done
    glMemoryBarrier( GL_SHADER_IMAGE_ACCESS_BARRIER_BIT );


    ClearAllBufferColor();
    glBindTexture(GL_TEXTURE_2D,rayTexture);
    quad.draw();

}





void processInput(GLFWwindow *window);
void framebuffer_size_callback(GLFWwindow* window, int width, int height); // framezize
void mouse_callback(GLFWwindow* window, double xpos, double ypos); // Maya Alt+LeftMouse
void scroll_callback(GLFWwindow *window, double xoffset, double yoffset);

int main()
{

    glfwInit();
    frameWindow = new FrameWindow(SRC_WIDTH,SRC_HEIGHT);
    glfwSetFramebufferSizeCallback(frameWindow->getWindow(), framebuffer_size_callback);
    glfwSetCursorPosCallback(frameWindow->getWindow(),mouse_callback);
    glfwSetScrollCallback(frameWindow->getWindow(), scroll_callback);
    glewInit();
    init();

    //double lastTime  = glfwGetTime();
    //double targetFps = 60.0f;

    FPSLimit fpsLimit;
    fpsLimit.targetFps = 60.02f;
    FPSGet fpsGet;



    while(!glfwWindowShouldClose(frameWindow->getWindow())){
        processInput(frameWindow->getWindow());
        fpsLimit.limitFPS();
        fpsGet.updateFPS(frameWindow->getWindow());


        // Rendering objs Loop
        display();


        glfwSwapBuffers(frameWindow->getWindow());
        glfwPollEvents();

    }


    delete camera;
    delete frameWindow;
    return 0;
}

void framebuffer_size_callback(GLFWwindow* window, int width, int height)
{
    // make sure the viewport matches the new window dimensions; note that width and
    // height will be significantly larger than specified on retina displays.
    glViewport(0, 0, width, height);

}

void processInput(GLFWwindow *window)
{
    if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS)
        glfwSetWindowShouldClose(window, true);
    if (glfwGetKey(window, GLFW_KEY_W) == GLFW_PRESS)
        camera->processKeyboardMove(deltaTime,GLFWCamera::FORWARD);
    if (glfwGetKey(window, GLFW_KEY_S) == GLFW_PRESS)
        camera->processKeyboardMove(deltaTime,GLFWCamera::BACKWARD);
    if (glfwGetKey(window, GLFW_KEY_A) == GLFW_PRESS)
        camera->processKeyboardMove(deltaTime,GLFWCamera::LEFT);
    if (glfwGetKey(window, GLFW_KEY_D) == GLFW_PRESS)
        camera->processKeyboardMove(deltaTime,GLFWCamera::RIGHT);
}

// ROTATE VIEW DIR
void mouse_callback(GLFWwindow* window, double xpos, double ypos){

    int middow_mouse_state = glfwGetMouseButton(window,GLFW_MOUSE_BUTTON_MIDDLE);
    int mouse_state = glfwGetMouseButton(window,GLFW_MOUSE_BUTTON_LEFT);
    int key_state = glfwGetKey(window,GLFW_KEY_LEFT_ALT);
    // set up the camera view
    if( mouse_state == GLFW_PRESS && key_state== GLFW_PRESS)
    {
        if (firstMouse){
            lastX = xpos;
            lastY = ypos;
            firstMouse = false;
        }
        float xoffset = xpos - lastX;
        float yoffset = lastY - ypos; // reversed since y-coordinates go from bottom to top
        lastX = xpos;
        lastY = ypos;
        camera->processMouseMove(xoffset,yoffset);
    }
    if (key_state == GLFW_RELEASE || mouse_state == GLFW_RELEASE){
        firstMouse = true;
    }


    // Move Camera Position
    if( middow_mouse_state == GLFW_PRESS) {

        if (firstMiddowMouse){
            lastX = xpos;
            lastY = ypos;
            firstMiddowMouse = false;
        }
        float xoffset = xpos - lastX;
        float yoffset = lastY - ypos; // reversed since y-coordinates go from bottom to top
        lastX = xpos;
        lastY = ypos;
        camera->pos.x += xoffset*0.01f;
        camera->pos.y += yoffset*0.01f;

    }
    if ( middow_mouse_state == GLFW_RELEASE){
        firstMiddowMouse = true;
    }

}

void scroll_callback(GLFWwindow *window, double xoffset, double yoffset){
    camera->processFov(yoffset);
}

main.cpp

#ifndef ALG_COMPUTESHADERLOADER_H
#define ALG_COMPUTESHADERLOADER_H
#include <string>

#define GLEW_STATIC
// GLEW
#include <GL/glew.h>

namespace AlgebraMaster {
using namespace std;
struct LoadComputeShade{
        void fromSrc(const string &vertCode);
        void load(const string &filePath);
        void compileShadeProgram();
        LoadComputeShade();
        ~LoadComputeShade();
        GLuint rayShader;
        GLuint rayProgram;
        inline void use(){glUseProgram(rayProgram);}
};

}


#endif // ALG_COMPUTESHADERLOADER_H

ALG_LoadComputeShader.h

#include "ALG_LoadComputeShader.h"
#include <fstream>
#include <sstream>
#include <iostream>
#include "ALG_CheckError.h"
namespace AlgebraMaster{

LoadComputeShade::LoadComputeShade(){}
LoadComputeShade::~LoadComputeShade(){}

void LoadComputeShade::load(const string&filePath){
    ifstream stream;
    stringstream ss;
    stream.exceptions(ifstream::badbit);
    try
    {
        stream.open(filePath);    // open file
        ss << stream.rdbuf(); // get strings from file
    } catch (ifstream::failure e)
    {
        cout << "ERROR::OPEN FILE:" << filePath << endl;
    }
    // close file handle
    stream.close();

    // get str() from stringstream
    string shaderCode = ss.str();

    fromSrc(shaderCode);


}

void LoadComputeShade::fromSrc(const string &vertCode){
    rayShader = glCreateShader(GL_COMPUTE_SHADER);
    const char * src = vertCode.c_str();

    // add common shader
    glShaderSource(rayShader, 1, &(src), NULL );

    // ----add common shader include path------
    const char* const searchPath = "shaders/common/";
    glCompileShaderIncludeARB(rayShader, 1, &searchPath, nullptr);

    // compile shader
    glCompileShader(rayShader );
    CheckShader(rayShader);
    // compile our program
    compileShadeProgram();
}


void LoadComputeShade::compileShadeProgram(){
    rayProgram = glCreateProgram();
    glAttachShader(rayProgram, rayShader );
    glLinkProgram(rayProgram );
    CheckShadeProgramLink(rayProgram);
    // Delete the allShaders as they're linked into our program now and no longer necessery
    glDeleteShader(rayShader );
}




}// end of namespace

ALG_LoadComput.cpp

ray dir light with plane:

#version 450 core
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;

const float k=0.000001;

struct Ray{
    vec3 o;
    vec3 d;
};

struct Sphere{
    vec3 c;
    float r;
};


struct Plane{
    vec3 P;
    vec3 N;
};

struct RayHit{
    vec3 hitP;
    vec3 hitN;
    bool state;
    float t;       // o + td
};


RayHit intersectSphere(Ray ray, Sphere sphere){
    float t;
    float tmin = 111111110.0f;
    vec3 temp = ray.o - sphere.c;
    float A = dot(ray.d,ray.d);
    float B = 2 * dot(temp, ray.d);
    float C = dot(ray.o-sphere.c,ray.o-sphere.c) - sphere.r * sphere.r;
    float disc = B * B - 4.0f * A * C;
    RayHit hit;
    hit.state = false; // default = false

    if(disc <0.0 )
        hit.state = false;
    else{
        float e = sqrt(disc);
        float denom = 2.0 * A;
        t = (-B -e) / denom; // smaller root
        if(t > k){
            tmin = t;
            hit.state = true;
            hit.t = t;
            hit.hitP = ray.o + t*ray.d;
            hit.hitN = normalize(hit.hitP - sphere.c);
            return hit;
        }

        t = (-B + e) / denom; // large root
        if(t > k){
            tmin = t;
            hit.state = true;
            hit.t = t;
            hit.hitP = ray.o + t*ray.d;
            hit.hitN = normalize(hit.hitP - sphere.c);
            return hit;
        }
    }
    return hit;
}


RayHit intersectPlane(Ray ray, Plane plane)
{
    float t = dot( plane.P - ray.o, plane.N) / dot(ray.d , plane.N);
    RayHit hit;
    hit.state = false;

    if(t>k){
        hit.state = true;
        hit.hitP = ray.o + t * ray.d;
        hit.hitN = plane.N;
        return hit;
    }else{
        hit.state =false;
        return hit;
    }
}





void main() {

    const float pixel_size = 1.0f;
    ivec2 texsize = imageSize(img_output);  // get current texture size , 500 * 500

    // base pixel colour for image
    vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
    // get index in global work group i.e x,y position
    ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);

    //
    // interesting stuff happens here later
    //


    float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0f));
    float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0f));

    //
    Ray ray; // ortho
    ray.o = vec3(x,y,0.0f);
    ray.d = vec3(0.0f, 0.0f, -1.0f);

    // scene objs
    Sphere sphere1;
    sphere1.c = vec3(0,0.0,-200);
    sphere1.r = 100.0f;
    Plane plane= {vec3(0,0,0), vec3(0,1,0)};

    vec3 lgtdir = normalize(vec3(1,1,0));

    RayHit hit;
    hit = intersectPlane(ray,plane);
    if(hit.state){
        float radiance = dot(lgtdir,hit.hitN);
        pixel = vec4(vec3(radiance),1.0f);
    }
     hit = intersectSphere(ray,sphere1);
    if(hit.state){
        float radiance = dot(lgtdir,hit.hitN);
        pixel = vec4(vec3(radiance),1.0f);
    }



    // output to a specific pixel in the image
    imageStore(img_output, pixel_coords, pixel);
}

shader

Ray hard-code scene:

这里改成了-1到-1 范围，可以让场景范围适当改小:

#version 450 core
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;

struct Ray{
    vec3 o;
    vec3 d;
};

struct Sphere{
    vec3 c;
    float r;
};
struct Plane{
    vec3 P; // cross this plane
    vec3 N;

};


struct RayHit{
    vec3 hitP;
    vec3 hitN;
    float t;// o + td
};

float random (vec2 st) {
    return fract(sin(dot(st.xy,
    vec2(12.9898,78.233)))*
    43758.5453123);
}
vec2 random2(vec2 st){
    float x = random(st);
    float y = random(st*4358.954 + vec2(213.031,823.113));
    return vec2(x,y);
}

struct Scene{
    Sphere sphere;
    Plane plane;
};






bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){
    float t;
    vec3 temp = ray.o - sphere.c;
    float A = dot(ray.d, ray.d);
    float B = 2 * dot(temp, ray.d);
    float C = dot(ray.o-sphere.c, ray.o-sphere.c) - sphere.r * sphere.r;
    float disc = B * B - 4.0f * A * C;


    if (disc <0.0)
    return false;
    else {
        float e = sqrt(disc);
        float denom = 2.0 * A;
        t = (-B -e) / denom;// smaller root
        if (t > tmin && t< tmax){
            hitRecord.t = t;
            hitRecord.hitP = ray.o + t*ray.d;
            hitRecord.hitN = normalize(hitRecord.hitP - sphere.c);
            return true;
        }
        t = (-B + e) / denom;// large root
        if (t > tmin && t<tmax){
            hitRecord.t = t;
            hitRecord.hitP = ray.o + t*ray.d;
            hitRecord.hitN = normalize(hitRecord.hitP - sphere.c);
            return true;
        }
    }
    return false;
}

bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){
    vec3 n = normalize(plane.N);
    double denom = dot(ray.d,n);
    double t = dot((plane.P - ray.o) , n) / denom;
    if(t > tmin && t< tmax){
        hitRecord.t = float(t);
        hitRecord.hitP = ray.o + float(t)*ray.d;
        hitRecord.hitN = plane.N;
        return true;
    }
    return false;
}


bool worldHit(inout Ray ray,  inout Scene scene , float tmin , float tmax, inout RayHit rec){
    bool hit_anyting = false;
    float close_so_far = tmax;


    RayHit temp_rec;
    if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){
        hit_anyting = true;
        close_so_far = temp_rec.t;
        rec = temp_rec;
    }
    if(raySphere(ray,scene.sphere,tmin,close_so_far,temp_rec)){
        hit_anyting = true;
        close_so_far = temp_rec.t;
        rec = temp_rec;
    }

    return hit_anyting;
}

vec3 color(inout Ray ray, inout Scene scene){
    float tmax = 10000000; // default very far scene


    RayHit record; // Hit record


    if(worldHit(ray,scene,0.0, tmax,record )){

        return record.hitN;
    }
    return vec3(0.0);
}





int samples = 1;

void main() {

    const float pixel_size = 1.0f;
    ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500

    // base pixel colour for image
    vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
    // get index in global work group i.e x,y position
    ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);

    //
    // interesting stuff happens here later
    //



    // scene objs
    Sphere sphere1;
    sphere1.c = vec3(0, 0, -1.0);
    sphere1.r = 0.5;

    Plane plane1;
    plane1.P = vec3(0,-0.49,0);
    plane1.N = vec3(0,1,0);

    // Keep 1:1 aspect , that is a square image
    float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0f));
    float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0f));
    x /= float(texsize.x/2);
    y /= float(texsize.y/2);

    // define the ray
    Ray ray;
    ray.o = vec3(0, 0, 10.0f);               // o as the camera pos
    ray.d = vec3(x,y,0) - ray.o;             // d is viewplane.pos - camera.pos
    ray.d = normalize(ray.d);                //

    Scene scene;
    scene.plane = plane1;
    scene.sphere = sphere1;


    pixel.rgb = color(ray, scene);


    // output to a specific pixel in the image
    imageStore(img_output, pixel_coords, pixel);
}

View Code

抗锯齿采样：用的最垃圾随机采样

构建一个宽度为主的画面，比如Raytracing from one weekend是2:1的画面比例。

#version 450 core
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;

struct Ray{
    vec3 o;
    vec3 d;
};

struct Sphere{
    vec3 c;
    float r;
};
struct Plane{
    vec3 P; // cross this plane
    vec3 N;

};

struct RayHit{
    vec3 hitP;
    vec3 hitN;
    float t;// o + td
};


struct Scene{
    Sphere sphere;
    Plane plane;
};






float rand(float n){return fract(sin(n) * 43758.5453123);}


bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){
    float t;
    vec3 temp = ray.o - sphere.c;
    float A = dot(ray.d, ray.d);
    float B = 2 * dot(temp, ray.d);
    float C = dot(ray.o-sphere.c, ray.o-sphere.c) - sphere.r * sphere.r;
    float disc = B * B - 4.0f * A * C;


    if (disc <0.0)
    return false;
    else {
        float e = sqrt(disc);
        float denom = 2.0 * A;
        t = (-B -e) / denom;// smaller root
        if (t > tmin && t< tmax){
            hitRecord.t = t;
            hitRecord.hitP = ray.o + t*ray.d;
            hitRecord.hitN = normalize(hitRecord.hitP - sphere.c);
            return true;
        }
        t = (-B + e) / denom;// large root
        if (t > tmin && t<tmax){
            hitRecord.t = t;
            hitRecord.hitP = ray.o + t*ray.d;
            hitRecord.hitN = normalize(hitRecord.hitP - sphere.c);
            return true;
        }
    }
    return false;
}

bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){
    vec3 n = normalize(plane.N);
    double denom = dot(ray.d,n);
    double t = dot((plane.P - ray.o) , n) / denom;
    if(t > tmin && t< tmax){
        hitRecord.t = float(t);
        hitRecord.hitP = ray.o + float(t)*ray.d;
        hitRecord.hitN = plane.N;
        return true;
    }
    return false;
}


bool worldHit(inout Ray ray,  inout Scene scene , float tmin , float tmax, inout RayHit rec){
    bool hit_anyting = false;
    float close_so_far = tmax;


    RayHit temp_rec;
    if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){
        hit_anyting = true;
        close_so_far = temp_rec.t;
        rec = temp_rec;
    }
    if(raySphere(ray,scene.sphere,tmin,close_so_far,temp_rec)){
        hit_anyting = true;
        close_so_far = temp_rec.t;
        rec = temp_rec;
    }

    return hit_anyting;
}

vec3 color(inout Ray ray, inout Scene scene){
    float tmax = 10000000; // default very far scene
    RayHit record; // Hit record
    if(worldHit(ray,scene,0.0, tmax,record )){

        return record.hitN;
    }
    else
    {
        vec3 unit_direction = ray.d;
        float t = 0.5 *(unit_direction.y + 1.0);
        return (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0);
    }


}


float rand(vec2 n) {
    return fract(sin(dot(n, vec2(12.9898, 4.1414))) * 43758.5453);
}


float PHI = 1.61803398874989484820459 * 00000.1; // Golden Ratio
float PI  = 3.14159265358979323846264 * 00000.1; // PI
float SQ2 = 1.41421356237309504880169 * 10000.0; // Square Root of Two

float gold_noise(in vec2 coordinate, in float seed){
    return fract(tan(distance(coordinate*(seed+PHI), vec2(PHI, PI)))*SQ2);
}


int MIN = -2147483648;
int MAX = 2147483647;



// A single iteration of Bob Jenkins' One-At-A-Time hashing algorithm.
uint hash( uint x ) {
    x += ( x << 10u );
    x ^= ( x >>  6u );
    x += ( x <<  3u );
    x ^= ( x >> 11u );
    x += ( x << 15u );
    return x;
}



// Compound versions of the hashing algorithm I whipped together.
uint hash( uvec2 v ) { return hash( v.x ^ hash(v.y)                         ); }
uint hash( uvec3 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z)             ); }
uint hash( uvec4 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w) ); }



// Construct a float with half-open range [0:1] using low 23 bits.
// All zeroes yields 0.0, all ones yields the next smallest representable value below 1.0.
float floatConstruct( uint m ) {
    const uint ieeeMantissa = 0x007FFFFFu; // binary32 mantissa bitmask
    const uint ieeeOne      = 0x3F800000u; // 1.0 in IEEE binary32

    m &= ieeeMantissa;                     // Keep only mantissa bits (fractional part)
    m |= ieeeOne;                          // Add fractional part to 1.0

    float  f = uintBitsToFloat( m );       // Range [1:2]
    return f - 1.0;                        // Range [0:1]
}



// Pseudo-random value in half-open range [0:1].
float random( float x ) { return floatConstruct(hash(floatBitsToUint(x))); }
float random( vec2  v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec3  v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec4  v ) { return floatConstruct(hash(floatBitsToUint(v))); }


void main() {

    const float pixel_size = 1.0f;
    ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500

    // base pixel colour for image
    vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
    // get index in global work group i.e x,y position
    ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);

    //
    // interesting stuff happens here later
    //



    // scene objs
    Sphere sphere1;
    sphere1.c = vec3(0, 0, -1.0);
    sphere1.r = 0.5;

    Plane plane1;
    plane1.P = vec3(0,-0.49,0);
    plane1.N = vec3(0,1,0);


    Scene scene;
    scene.plane = plane1;
    scene.sphere = sphere1;


    int samples = 10;

    for(int i=0;i<samples;i++){
        // Keep 1:1 aspect , that is a square image

        float ox = 0.0f;
        float oy = 0.0f;

        if(false){
            ox = gold_noise(pixel_coords,0+i);
            oy = gold_noise(pixel_coords,1+i);
            ox = clamp(ox,0,1);
            oy = clamp(oy,0,1);
        }
        else{
            // ---------------- GEN unit-square sampers --------------
            float seedx = (float(i) + 1.0f) * 231;
            float seedy= (float(i) + 5000.0f) * 2311;
            ox = random(seedx);
            oy = random(seedy);
        }



        float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x -1.0f ) + ox );
        float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y -1.0f ) + oy );

        float aspect = float(texsize.y) / float(texsize.x);

        x /= float(texsize.x)/2.0f;
        y /= float(texsize.y)/2.0f;

        x /= aspect;

        // define the ray
        Ray ray;
        ray.o = vec3(0, 0, 10.0f);               // o as the camera pos
        ray.d = vec3(x,y,0) - ray.o;             // d is viewplane.pos - camera.pos
        ray.d = normalize(ray.d);                //

        pixel.rgb += color(ray, scene);

    }
    pixel.rgb /= float(samples);

    // output to a specific pixel in the image
    imageStore(img_output, pixel_coords, pixel);
}

View Code

环境遮挡：

把球位移到法线中心，也就是球中心为: hitP + hitN ,那么:

ray.o = hitP;
ray.d = hitP+hitN + unit_sample_sphere() - ray.o;

然后用递归循环：

递归结束条件就是光线一直追踪不到的时候，然后返回背景色，也就是他的else{}代码块。原文是将全球放到hitP上方，然后做一直递归循环直到收敛。

由于GLSL不能写递归，所以只能一次获取一个sampler来获取环境。当然噪点满屏

#version 450 core
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;

float PI = 3.1415926;
struct Ray{
    vec3 o;
    vec3 d;
};

struct Sphere{
    vec3 c;
    float r;
    vec3 Cd;
};
struct Plane{
    vec3 P; // cross this plane
    vec3 N;
    vec3 Cd;

};

struct RayHit{
    vec3 hitP;
    vec3 hitN;
    float t;// o + td
    vec3 hitCd;
};


struct Scene{
    Sphere spheres[3];
    Plane plane;
};






float rand(float n){return fract(sin(n) * 43758.5453123);}


bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){
    float t;
    vec3 temp = ray.o - sphere.c;
    float A = dot(ray.d, ray.d);
    float B = 2 * dot(temp, ray.d);
    float C = dot(ray.o-sphere.c, ray.o-sphere.c) - sphere.r * sphere.r;
    float disc = B * B - 4.0f * A * C;


    if (disc <0.0)
    return false;
    else {
        float e = sqrt(disc);
        float denom = 2.0 * A;
        t = (-B -e) / denom;// smaller root
        if (t > tmin && t< tmax){
            hitRecord.t = t;
            hitRecord.hitP = ray.o + t*ray.d;
            hitRecord.hitN = normalize(hitRecord.hitP - sphere.c);
            hitRecord.hitCd = sphere.Cd;
            return true;
        }
        t = (-B + e) / denom;// large root
        if (t > tmin && t<tmax){
            hitRecord.t = t;
            hitRecord.hitP = ray.o + t*ray.d;
            hitRecord.hitN = normalize(hitRecord.hitP - sphere.c);
            hitRecord.hitCd = sphere.Cd;
            return true;
        }
    }
    return false;
}

bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){
    vec3 n = normalize(plane.N);
    double denom = dot(ray.d,n);
    double t = dot((plane.P - ray.o) , n) / denom;
    if(t > tmin && t< tmax){
        hitRecord.t = float(t);
        hitRecord.hitP = ray.o + float(t)*ray.d;
        hitRecord.hitN = plane.N;
        hitRecord.hitCd = plane.Cd;
        return true;
    }
    return false;
}


bool worldHit(inout Ray ray,  inout Scene scene , float tmin , float tmax, inout RayHit rec){
    bool hit_anyting = false;
    float close_so_far = tmax;

    // raytracing plane
    RayHit temp_rec;
    if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){
        hit_anyting = true;
        close_so_far = temp_rec.t;
        rec = temp_rec;
    }

    // raytracing spheres
    for(int i=0;i<3;i++){
        if(raySphere(ray,scene.spheres[i],tmin,close_so_far,temp_rec)){
            hit_anyting = true;
            close_so_far = temp_rec.t;
            rec = temp_rec;
        }
    }

    return hit_anyting;
}




uint rand(inout uint state)
{
    uint x = state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 15;
    state = x;
    return x;
}

// ------------------------------------------------------------------

float random_float_01(inout uint state)
{
    return (rand(state) & uint(0xFFFFFF)) / 16777216.0f;
}



uint g_state =123;  // global state for random





vec3 random_in_unit_sphere(inout uint state)
{
    float d, x, y, z;
    do {
        x = random_float_01(state) * 2.0 - 1.0;
        state +=1 ;
        y = random_float_01(state) * 2.0 - 1.0;
        state +=3 ;
        z = random_float_01(state) * 2.0 - 1.0;
        d = x*x + y*y + z*z;
    } while(d > 1.0);

    return vec3(x, y, z);
}


vec3 color(inout Ray ray, inout Scene scene){
    vec3 unit_direction = ray.d;
    float t = 0.5 *(unit_direction.y + 1.0);
    vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0);




    float tmax = 123213214; // default very far scene
    RayHit record; // Hit record
    if( worldHit(ray,scene,0.0, tmax, record )){

        vec3 target = record.hitP + record.hitN + random_in_unit_sphere(g_state);
        Ray tempRay;
        tempRay.o = record.hitP;
        tempRay.d = normalize(target - record.hitP);

        RayHit newRec;
        if(worldHit(tempRay,scene,0.0, tmax, newRec ) ) {  // if in shadow hit
            return vec3(0);
        }
        else{
            return background;  // else return back ground color
        }
        //return record.hitCd;
    }
    else
    {
        return background;
    }
}


void main() {

    const float pixel_size = 1.0f;
    ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500

    // base pixel colour for image
    vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
    // get index in global work group i.e x,y position
    ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);

    //
    // interesting stuff happens here later
    //



    // scene objs
    Sphere sphere1;
    sphere1.c = vec3(0, 0, -1.0);
    sphere1.r = 0.5;
    sphere1.Cd = vec3(1,0,0);

    Sphere sphere2;
    sphere2.c = vec3(-1.1, 0, -1.0);
    sphere2.r = 0.5;
    sphere2.Cd = vec3(0,1,0);

    Sphere sphere3;
    sphere3.c = vec3(1.1, 0, -1.0);
    sphere3.r = 0.5;
    sphere3.Cd = vec3(0,0,1);

    Plane plane1;
    plane1.P = vec3(0,-0.49,0);
    plane1.N = vec3(0,1,0);
    plane1.Cd = vec3(1,1,1);


    Scene scene;
    scene.plane = plane1;
    scene.spheres[0] = sphere1;
    scene.spheres[1] = sphere2;
    scene.spheres[2] = sphere3;


    int samples = 8;

    vec3 acccolor = vec3(0);


    for(int i=0;i<samples;i++){
        // Keep 1:1 aspect , that is a square image

        float ox = 0.0f;
        float oy = 0.0f;

        uint seedx = int( pixel_coords.x + i*132) * 231;
        uint seedy = int( pixel_coords.y + i*126) * 2311;
        ox = random_float_01(seedx);
        oy = random_float_01(seedy);


        float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x -1.0f ) + ox );
        float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y -1.0f ) + oy );

        float aspect = float(texsize.y) / float(texsize.x);

        x /= float(texsize.x)/2.0f;
        y /= float(texsize.y)/2.0f;

        x /= aspect;

        // define the ray
        Ray ray;
        ray.o = vec3(0, 0, 10.0f);               // o as the camera pos
        ray.d = vec3(x,y,0) - ray.o;             // d is viewplane.pos - camera.pos
        ray.d = normalize(ray.d);                //

        acccolor += color(ray, scene);

    }
    acccolor /= float(samples);


    //acccolor = random_in_unit_sphere(g_state);


    // output to a specific pixel in the image
    imageStore(img_output, pixel_coords, vec4(acccolor,1.0f));
}

View Code

代码的gstate 就是个全局变量对于每个像素来说都是属于一个像素内变量，所以可以使用它来追踪随机函数的值变化。

实际上(Fixing Shadow Acne)噪点的出现，经过我测试，由于raybias的原因，对代码稍微做修改(下面改了camera的origin)：

vec3 color(inout Ray ray, inout Scene scene){
    vec3 unit_direction = ray.d;
    float t = 0.5 *(unit_direction.y + 1.0);
    vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0);

    float bias = 0.0001f;
    float tmax = 123213214; // default very far scene
    RayHit record; // Hit record
    if( worldHit(ray,scene,0.0, tmax, record )){

        vec3 target = record.hitP + record.hitN + unit_sphere_sample(g_state);
        Ray tempRay;
        tempRay.o = record.hitP + record.hitN * bias;
        tempRay.d = normalize(target - record.hitP);

        RayHit newRec;
        if(worldHit(tempRay,scene,0.0, tmax, newRec ) ) {  // if in shadow hit
            return vec3(0.2);
        }
        else{
            return vec3(1.0);  // else return back ground color
        }
        //return record.hitCd;
    }
    else
    {
        return background;
    }
}

View Code

当然(Fixing Shadow Acne)也可以直接把tmin = 0.000001f;

RayHit newRec;
if(worldHit(tempRay,scene,0.000001f, tmax, newRec ) ) {  // if in shadow hit
　　return vec3(0.2);
}
else{
　　return vec3(1.0);  // else return back ground color
}

接下来用cos-weighted半球分布完成,这里面都是随机采样方法.

vec2 unit_square_sample( float i){
    //uint sx = i+152;
    //uint sy = i*100+317;
    //return vec2(rand(sx),rand(sy));
    float x = random(i + 10);
    float y = random(i + 10000);
    return vec2(x,y);
}

vec3 unit_sphere_sample(inout uint state)
{
    float d, x, y, z;
    do {
        x = rand(state) * 2.0 - 1.0;
        state +=1 ;
        y = rand(state) * 2.0 - 1.0;
        state +=3 ;
        z = rand(state) * 2.0 - 1.0;
        d = x*x + y*y + z*z;
    } while(d > 1.0);

    return vec3(x, y, z);
}




// Xi should use unit_square_sample() get
vec3 cos_weighted_hemisphere(vec2 Xi,float e){
    float cos_phi = cos(2.0f * PI * Xi.x);
    float sin_phi = sin(2.0f * PI * Xi.x);

    float cos_theta = pow((1.0f - Xi.y) , 1.0f / (e+1.0));
    float sin_theta = sqrt(1.0f - cos_theta * cos_theta);

    // Generation Z based sphere
    float pu = sin_theta * cos_phi;
    float pv = sin_theta * sin_phi;
    float pw = cos_theta;
    vec3 H = vec3(pu,pv,pw);
    return H;
}


// align the input vector to Normal
vec3 sample_on_normal(vec3 inputSample,vec3 N){
    // from tangent-space vector to world-space sample vector
    //vec3  up        = abs(N.z) < 0.999 ? vec3(0.0, 0.0, 1.0) : vec3(1.0, 0.0, 0.0);
    vec3 up = vec3(0.000001f,1.0,0);
    up = normalize(up);
    vec3 tangent   = normalize(cross(up, N));
    vec3 bitangent = cross(N, tangent);
    bitangent = normalize(bitangent);
    vec3 sampleVec = tangent * inputSample.x + bitangent * inputSample.y + N * inputSample.z;
    sampleVec = normalize(sampleVec);
    return sampleVec;
}

samples.glsl

uint randbit(inout uint state)
{
    uint x = state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 15;
    state = x;
    return x;
}


// ------------------------------------------------------------------
// return 0-1
float rand(inout uint state)
{
    return float(randbit(state) & uint(0xFFFFFF)) / 16777216.0f;
}



// ----------------------------- HDK liked Noise -------------------------------------------------------

// A single iteration of Bob Jenkins' One-At-A-Time hashing algorithm.
uint hash( uint x ) {
    x += ( x << 10u );
    x ^= ( x >>  6u );
    x += ( x <<  3u );
    x ^= ( x >> 11u );
    x += ( x << 15u );
    return x;
}



// Compound versions of the hashing algorithm I whipped together.
uint hash( uvec2 v ) { return hash( v.x ^ hash(v.y)                         ); }
uint hash( uvec3 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z)             ); }
uint hash( uvec4 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w) ); }



// Construct a float with half-open range [0:1] using low 23 bits.
// All zeroes yields 0.0, all ones yields the next smallest representable value below 1.0.
float floatConstruct( uint m ) {
    const uint ieeeMantissa = 0x007FFFFFu; // binary32 mantissa bitmask
    const uint ieeeOne      = 0x3F800000u; // 1.0 in IEEE binary32

    m &= ieeeMantissa;                     // Keep only mantissa bits (fractional part)
    m |= ieeeOne;                          // Add fractional part to 1.0

    float  f = uintBitsToFloat( m );       // Range [1:2]
    return f - 1.0;                        // Range [0:1]
}



// Pseudo-random value in half-open range [0:1].
float random( float x ) { return floatConstruct(hash(floatBitsToUint(x))); }
float random( vec2  v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec3  v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec4  v ) { return floatConstruct(hash(floatBitsToUint(v))); }

// ----------------------------- HDK liked Noise -------------------------------------------------------


float PHI = 1.61803398874989484820459;  // Φ = Golden Ratio

float gold_noise(in vec2 xy, in float seed){
    return fract(tan(distance(xy*PHI, xy)*seed)*xy.x);
}

rand.glsl

#version 450 core
#extension GL_ARB_shading_language_include : require
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba32f, binding = 0) uniform image2D img_output;



#include "/shaders/raytracing/common/constant.glsl"  // first include constant
#include "/shaders/raytracing/common/rand.glsl"      // this have rand
#include "/shaders/raytracing/common/samples.glsl" // this will use the constant mem, rand()




struct Ray{
    vec3 o;
    vec3 d;
};

struct Sphere{
    vec3 c;
    float r;
    vec3 Cd;
};
struct Plane{
    vec3 P; // cross this plane
    vec3 N;
    vec3 Cd;

};

struct RayHit{
    vec3 hitP;
    vec3 hitN;
    float t;// o + td
    vec3 hitCd;
};


struct Scene{
    Sphere spheres[3];
    Plane plane;
};



bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){
    vec3 oc = ray.o- sphere.c;
    float a = dot(ray.d,ray.d);
    float half_b = dot(oc, ray.d);
    float c = dot(oc,oc) - sphere.r * sphere.r;
    float discriminant = half_b*half_b - a*c;

    if (discriminant > 0) {
        float root = sqrt(discriminant);
        float temp = (-half_b - root)/a;

        if (temp < tmax && temp > tmin) {
            hitRecord.t = float(temp);
            hitRecord.hitP = ray.o + hitRecord.t * ray.d;
            hitRecord.hitN = (hitRecord.hitP - sphere.c) / sphere.r;
            hitRecord.hitCd = sphere.Cd;
            return true;
        }

        temp = (-half_b + root) / a;
        if (temp < tmax && temp > tmin) {
            hitRecord.t = float(temp);
            hitRecord.hitP = ray.o + hitRecord.t * ray.d;
            hitRecord.hitN = (hitRecord.hitP - sphere.c) / sphere.r;
            hitRecord.hitCd = sphere.Cd;
            return true;
        }
    }
    return false;
}

bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){
    vec3 n = normalize(plane.N);
    double denom = dot(ray.d,n);
    double t = dot((plane.P - ray.o) , n) / denom;
    if(t > tmin && t< tmax){
        hitRecord.t = float(t);
        hitRecord.hitP = ray.o + float(t)*ray.d;
        hitRecord.hitN = plane.N;
        hitRecord.hitCd = plane.Cd;
        return true;
    }
    return false;
}


bool worldHit(inout Ray ray,  inout Scene scene , float tmin , float tmax, inout RayHit rec){
    bool hit_anyting = false;
    float close_so_far = tmax;

    // raytracing plane
    RayHit temp_rec;
    if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){
        hit_anyting = true;
        close_so_far = temp_rec.t;
        rec = temp_rec;
    }

    // raytracing spheres
    for(int i=0;i<3;i++){
        if(raySphere(ray,scene.spheres[i],tmin,close_so_far,temp_rec)){
            hit_anyting = true;
            close_so_far = temp_rec.t;
            rec = temp_rec;
        }
    }

    return hit_anyting;
}







uint g_state = 0;  // global state for random

const int cos_samples = 64;


vec3 color(inout Ray ray, inout Scene scene){
    vec3 unit_direction = ray.d;
    float t = 0.5 *(unit_direction.y + 1.0);
    vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0);
    float bias = .00001f;


    float tmax = 34132145; // default very far scene
    RayHit record; // Hit record
    if( worldHit(ray,scene,0.0, tmax, record )){
        vec3 cos_weighted_L = vec3(0);
        for(int i=0;i<cos_samples;i++){

            g_state += i;
            float x = rand(g_state);
            g_state += i;
            float y = rand(g_state);

            vec2 Xi = vec2(x,y);
            float cos_weight = 0.001f;
            vec3 z_hemisphere = cos_weighted_hemisphere(Xi, cos_weight);
            vec3 hemisphere = z_hemisphere;
            vec3 dir = sample_on_normal(hemisphere, record.hitN);
            // next ray tracing
            Ray tempRay;
            tempRay.o = record.hitP + record.hitN * bias ;
            tempRay.d = normalize(dir );

            RayHit newRec;
            if(worldHit(tempRay,scene,0.0, tmax, newRec ) ) {  // if in shadow hit
                cos_weighted_L += vec3(0);
            }
            else{
                cos_weighted_L += background;  // else return back ground color
            }
        }
        cos_weighted_L /= float(cos_samples);
        cos_weighted_L *= record.hitCd;  // add sphere default color
        return cos_weighted_L;
    }
    else
    {
        return background;
    }
}


void main() {

    const float pixel_size = 1.0f;
    ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500

    // base pixel colour for image
    vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0);
    // get index in global work group i.e x,y position
    ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy);

    //
    // interesting stuff happens here later
    //
    g_state = gl_GlobalInvocationID.x * 1973 + gl_GlobalInvocationID.y * 9277 + 2699 | 1;



    // scene objs
    Sphere sphere1;
    sphere1.c = vec3(0, 0, -1.0);
    sphere1.r = 0.5;
    sphere1.Cd = vec3(1,0,0);

    Sphere sphere2;
    sphere2.c = vec3(-1.1, 0, -1.0);
    sphere2.r = 0.5;
    sphere2.Cd = vec3(0,1,0);

    Sphere sphere3;
    sphere3.c = vec3(1.1, 0, -1.0);
    sphere3.r = 0.5;
    sphere3.Cd = vec3(0,0,1);

    Plane plane1;
    plane1.P = vec3(0,-0.49,0);
    plane1.N = vec3(0,1,0);
    plane1.Cd = vec3(1,1,1);


    Scene scene;
    scene.plane = plane1;
    scene.spheres[0] = sphere1;
    scene.spheres[1] = sphere2;
    scene.spheres[2] = sphere3;

    //genCosWeigtedSamples();
    int AA = 6;
    vec3 acccolor = vec3(0);


    for(int i=0;i<AA;i++){
        // Keep 1:1 aspect , that is a square image
        float ox = 0.0f;
        float oy = 0.0f;
        ox = rand(g_state);
        oy = rand(g_state);

        float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x -1.0f ) + ox );
        float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y -1.0f ) + oy );
        float aspect = float(texsize.y) / float(texsize.x);
        x /= float(texsize.x)/2.0f;
        y /= float(texsize.y)/2.0f;
        x /= aspect;

        // define the ray
        Ray ray;
        ray.o = vec3(0, 0, 5.0f);               // o as the camera pos
        ray.d = vec3(x,y,0) - ray.o;             // d is viewplane.pos - camera.pos
        ray.d = normalize(ray.d);                //
        acccolor += color(ray, scene);

    }
    acccolor /= float(AA);


    //acccolor = vec3(random(pixel_coords),random(pixel_coords+0.0001f),random(pixel_coords+0.001f) );

    acccolor = sqrt(acccolor);

    // output to a specific pixel in the image
    imageStore(img_output, pixel_coords, vec4(acccolor,1.0f));
}

shader

当然如上面(Fixing Shadow Acne)要解决raybias问题也可以直接设置tmin

vec3 color(inout Ray ray, inout Scene scene){
    vec3 unit_direction = ray.d;
    float t = 0.5 *(unit_direction.y + 1.0);
    vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0);


    //float bias = .000001f;

    float tmin = 0.00001f;
    float tmax = 34132145; // default very far scene
    RayHit record; // Hit record
    if( worldHit(ray,scene,tmin, tmax, record )){
        vec3 cos_weighted_L = vec3(0);
        for(int i=0;i<cos_samples;i++){

            g_state += i;
            float x = rand(g_state);
            g_state += i;
            float y = rand(g_state);

            vec2 Xi = vec2(x,y);
            float cos_weight = 0.001f;
            vec3 z_hemisphere = cos_weighted_hemisphere(Xi, cos_weight);
            vec3 hemisphere = z_hemisphere;
            vec3 dir = sample_on_normal(hemisphere, record.hitN);
            // next ray tracing
            Ray tempRay;
            tempRay.o = record.hitP  ;
            tempRay.d = normalize(dir );

            RayHit newRec;
            if(worldHit(tempRay,scene,tmin, tmax, newRec ) ) {  // if in shadow hit
                cos_weighted_L += vec3(0);
            }
            else{
                cos_weighted_L += background;  // else return back ground color
            }
        }
        cos_weighted_L /= float(cos_samples);
        cos_weighted_L *= record.hitCd;  // add sphere default color
        return cos_weighted_L;
    }
    else
    {
        return background;
    }
}

View Code

REF:http://antongerdelan.net/opengl/compute.html

Raytracing from ground up

https://www.scratchapixel.com/lessons/3d-basic-rendering/minimal-ray-tracer-rendering-simple-shapes/ray-plane-and-ray-disk-intersection

https://relativity.net.au/gaming/glsl/Functions.html

Raytracing in one weekend

https://gist.github.com/patriciogonzalezvivo/670c22f3966e662d2f83

rand:https://www.itranslater.com/qa/details/2126153185973765120,史上最好的rand，和houdini一样

https://github.com/diharaw/GPUPathTracer progressive rendering

相关阅读:
9.27作业
 9.20作业
 9.13作业
 9.6作业
 localstorage,sessionstorage使用
 oracle 客户端重新安装遇到的问题
 HTML5学习笔记(<command>,<details>,<dialog>)
HTML5学习笔记(HTML5新标签<bdi>)
html5学习笔记(html5新标签<aside>)
html5学习笔记(html5新标签<article>)
原文地址：https://www.cnblogs.com/gearslogy/p/12863318.html