Friday, June 24, 2011

CV - Camera model

Wonwoo Lee's webpage
http://sites.google.com/site/wleeprofile/home

http://old.uvr.gist.ac.kr/wlee/web/techReports/ar/Camera%20Models.html


 
Wonwoo’s Augmented Reality Web Page

Writer: Wonwoo Lee
Last updated: 2006.10.18
Contact: wlee@gist.ac.kr (http://uvr.gist.ac.kr/wlee)
AR Technical Report: Real Camera vs. Virtual Camera - How to Augment?

This article explains how to compute the parameters OpenGL needs for rendering virtual objects when the intrinsic and extrinsic parameters of the camera are known.

  Augmented Reality (AR) has become very popular these days. There is a great deal of research in this area, and many researchers and students use ARToolKit, a library providing marker tracking and virtual object augmentation. ARToolKit is very easy to use. However, if you are working on or interested in AR and computer vision, you may want to build your own ARToolKit, or to augment virtual objects without it. Actually, augmenting a virtual object on an image is not a difficult problem for people working on 3D computer vision, as long as we know the intrinsic and extrinsic parameters of the camera with which the image was taken. The one thing that bothers us is mapping the real camera parameters to the virtual camera parameters. Since 3D graphics libraries such as OpenGL and Direct3D do not use the same parameters for rendering a virtual scene, we cannot apply the intrinsic/extrinsic parameters of a camera directly to virtual object augmentation.

  In this article we explain the relationship between the real and virtual camera parameters and provide our experimental results. We also provide a C++ class with functions that convert real camera parameters to OpenGL parameters. (We use OpenGL for virtual scene rendering because OpenGL is familiar to most AR researchers.)
 
1. Real Camera Model
  The real camera model has a right-handed local coordinate system and initially looks in the +z direction, (0, 0, 1).
 
    A 3D point is projected onto the image plane by the following procedure. We do not explain the full derivation here; please refer to the book ‘Multiple View Geometry’ by Hartley and Zisserman.
 
 
    Mathematically, the projection can be written as follows:

        x = P X   (up to scale)

    where X = (X, Y, Z, 1)^T is a 3D point in homogeneous coordinates and x = (U_real, V_real, 1)^T is its projection on the image plane.
 
    The projection matrix P is

        P = K [R | t],  where  K = [ f_x   s   u_0 ]
                                   [  0   f_y  v_0 ]
                                   [  0    0    1  ]

    Here K contains the intrinsic parameters (focal lengths f_x and f_y, skew s, and principal point (u_0, v_0)), and [R | t] contains the extrinsic parameters (rotation and translation from world to camera coordinates).
 
    By multiplying P and X, we obtain the projected point x:

        P X = ( p_1 · X,  p_2 · X,  p_3 · X )^T

    where p_i denotes the i-th row of P.
 
 
    Finally, the projected point has the following coordinates:

        U_real = (p_1 · X) / (p_3 · X)
        V_real = (p_2 · X) / (p_3 · X)
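
    As a concrete illustration, here is a minimal C++ sketch of this projection (plain pinhole model, no lens distortion; the function name and parameter layout are ours for illustration only, not the interface of the camera class provided in Section 6):

        // Project a 3D world point Xw onto the image plane: x = K [R | t] Xw.
        void projectPoint(const double K[3][3], const double R[3][3], const double t[3],
                          const double Xw[3], double &uReal, double &vReal)
        {
            // Transform the point into the camera's local coordinates: Xc = R * Xw + t.
            double Xc[3];
            for (int i = 0; i < 3; ++i)
                Xc[i] = R[i][0] * Xw[0] + R[i][1] * Xw[1] + R[i][2] * Xw[2] + t[i];

            // Apply the intrinsic matrix K and divide by the third coordinate (p_3 · X).
            const double x = K[0][0] * Xc[0] + K[0][1] * Xc[1] + K[0][2] * Xc[2];
            const double y = K[1][1] * Xc[1] + K[1][2] * Xc[2];
            uReal = x / Xc[2];  // U_real
            vReal = y / Xc[2];  // V_real
        }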
 
 
 
 
2. Virtual Camera Model
    For the virtual camera, we follow OpenGL’s camera model, which initially looks in the -z direction. The +y direction is the camera’s up vector.
 
  In OpenGL, a 3D vertex is rendered on the screen through the following procedure:

      object coordinates --(modelview matrix)--> eye coordinates --(projection matrix)--> clip coordinates --(perspective division)--> normalized device coordinates --(viewport transformation)--> window coordinates
 
 
  The modelview matrix transforms a 3D vertex to eye coordinates, which are the same as the camera’s local coordinates. Thus, the modelview matrix corresponds to the extrinsic parameters of a real camera. The modelview matrix is a 4x4 matrix whose 4th row is (0, 0, 0, 1).
  The projection matrix transforms the vertex from the eye coordinate system to the clip coordinate system. The parameters Near, Far, aspect, and fovy below are those of the function gluPerspective, which sets the projection matrix.
 
gluPerspective(fovy, aspect, Near, Far) ;
 
  With f = cot(fovy / 2), the projection matrix set by gluPerspective is

      P_gl = [ f/aspect   0             0                        0              ]
             [    0       f             0                        0              ]
             [    0       0   (Far+Near)/(Near-Far)   2*Far*Near/(Near-Far)     ]
             [    0       0            -1                        0              ]
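
  As a quick cross-check, the following C++ sketch builds the same matrix by hand, in the column-major layout OpenGL uses; fovyDeg is in degrees, as for gluPerspective. (This is only an illustration; in practice gluPerspective does this for us.)

      #include <cmath>

      // Fill m (16 doubles, column-major) with the gluPerspective matrix.
      void buildPerspective(double fovyDeg, double aspect,
                            double zNear, double zFar, double m[16])
      {
          const double PI = 3.14159265358979323846;
          const double f = 1.0 / std::tan(fovyDeg * PI / 360.0);  // cot(fovy/2)
          for (int i = 0; i < 16; ++i) m[i] = 0.0;
          m[0]  = f / aspect;                            // row 1: x scale
          m[5]  = f;                                     // row 2: y scale
          m[10] = (zFar + zNear) / (zNear - zFar);       // row 3: depth mapping
          m[14] = 2.0 * zFar * zNear / (zNear - zFar);
          m[11] = -1.0;                                  // row 4: w_clip = -z_eye
      }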
 
 
  We don’t need to worry about ‘Perspective Division’ because OpenGL and graphics hardware handle it internally.
 
  The last step is the ‘viewport transformation’. If you have taken a computer graphics course, you may already know it. In OpenGL, the viewport transformation is set by the function glViewport.
 
glViewport(x0, y0, width, height) ;
 
  The corresponding transformation maps the vertex from the clip coordinate system to window coordinates:

      s * ( U_virtual, V_virtual, 1 )^T = V_p * ( x_clip, y_clip, z_clip, w_clip )^T

      V_p = [ width/2      0       0   x_0 + width/2  ]
            [    0      height/2   0   y_0 + height/2 ]
            [    0         0       0         1        ]

  Here, s is a scale factor: dividing by s = w_clip implicitly performs the perspective division.
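
  In code, this last step might look like the following sketch (purely illustrative; OpenGL performs it internally):

      // Map clip coordinates to window coordinates (what OpenGL does internally).
      void viewportTransform(double xClip, double yClip, double wClip,
                             double x0, double y0, int width, int height,
                             double &uWin, double &vWin)
      {
          // Perspective division to normalized device coordinates.
          const double xn = xClip / wClip;
          const double yn = yClip / wClip;
          // Viewport transformation.
          uWin = x0 + (xn + 1.0) * width  / 2.0;
          vWin = y0 + (yn + 1.0) * height / 2.0;
      }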
 
 
 
  Since the extrinsic parameters of a real camera play the same role as the modelview matrix in OpenGL, the transformation from the world coordinate system to the eye coordinate system is identical in both camera models.
 
  Writing the whole OpenGL projection model as matrix multiplications, starting from the eye coordinates X_e = (X_e, Y_e, Z_e, 1)^T:

      s * ( U_virtual, V_virtual, 1 )^T = V_p * P_gl * X_e

  where V_p is the viewport matrix above and P_gl is the gluPerspective matrix.
 
 
  Finally, we get the pixel coordinates of a 3D vertex in OpenGL:

      U_virtual = x_0 + width/2  + (width/2)  * (f/aspect) * X_e / (-Z_e)
      V_virtual = y_0 + height/2 + (height/2) *  f         * Y_e / (-Z_e)
 
 
 
 
3. Correspondences between two camera models
     By intuition, we might expect U_real to equal U_virtual and V_real to equal V_virtual. However, there is ONE MORE THING we have to consider: the correspondence between the coordinate systems of the two models.
     Firstly, the initial viewing direction of the real camera model is opposite to the virtual camera’s: the real camera looks along (0, 0, 1) while the virtual camera looks along (0, 0, -1), and their y axes also point in opposite directions (image y runs downward, while OpenGL’s up vector points upward). To make them match, we multiply a scaling matrix before the OpenGL projection matrix:

         S = diag(1, -1, -1, 1)

     which flips the y and z axes of the eye coordinate system.
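
     In code, this flip can be folded directly into the modelview matrix. Below is a minimal sketch (our own helper, not the camera class from Section 6) that builds the OpenGL modelview matrix, in column-major order, as S * [R | t]:

         // Build the OpenGL modelview matrix (column-major) from the real camera's
         // extrinsics R, t, applying the axis flip S = diag(1, -1, -1, 1).
         void buildModelview(const double R[3][3], const double t[3], double m[16])
         {
             for (int col = 0; col < 4; ++col) {
                 for (int row = 0; row < 3; ++row) {
                     const double v = (col < 3) ? R[row][col] : t[row];
                     // The y (row 1) and z (row 2) components are negated by S.
                     m[col * 4 + row] = (row == 0) ? v : -v;
                 }
                 m[col * 4 + 3] = (col == 3) ? 1.0 : 0.0;  // bottom row (0, 0, 0, 1)
             }
         }

     The resulting 16 values can then be loaded with glLoadMatrixd.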
 
 
     Secondly, the pixel coordinates are also defined differently in OpenGL and in a real image: the real image has its origin at the top-left corner with v running downward, while OpenGL’s window coordinates have their origin at the bottom-left corner with y running upward. We therefore have to convert one of the two points, P1 = (U_real, V_real) or P2 = (U_virtual, V_virtual), as follows:

         (U, V)  ->  (U, height - V)
 
 
     By applying the scaling matrix and the pixel coordinate conversion, the projection of a 3D vertex in OpenGL becomes

         U_virtual = x_0 + width/2  + (width/2)  * (f/aspect) * X / Z
         V_virtual = y_0 + height/2 - (height/2) *  f         * Y / Z

     where (X, Y, Z) are the vertex’s coordinates in the real camera’s local frame. Converting to the image’s top-left origin gives the point to compare: (U_virtual, height - V_virtual).
 
 
     Now, we are ready to compare both projection results:

         U_real = U_virtual:           u_0 + f_x * X / Z = x_0 + width/2 + (width/2) * (f/aspect) * X / Z
         V_real = height - V_virtual:  v_0 + f_y * Y / Z = height - y_0 - height/2 + (height/2) * f * Y / Z

     Since these equalities must hold for every point (X, Y, Z), the constant terms and the depth-dependent terms must match separately.
     Finally, we get the parameters for rendering in OpenGL:

         fovy   = 2 * arctan( height / (2 * f_y) )
         aspect = (width * f_y) / (height * f_x)
         x_0    = u_0 - width / 2
         y_0    = height / 2 - v_0

     (Here we assume zero skew, s = 0, since gluPerspective cannot represent a skewed pixel grid.)
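
     As a sketch, the conversion can be implemented as below (a minimal illustration under the zero-skew assumption; the struct and function names are ours, not the interface of the camera class from Section 6):

         #include <cmath>

         // OpenGL rendering parameters derived from real camera intrinsics.
         struct GLCameraParams { double fovyDeg, aspect, x0, y0; };

         GLCameraParams toOpenGLParams(double fx, double fy, double u0, double v0,
                                       int width, int height)
         {
             const double PI = 3.14159265358979323846;
             GLCameraParams p;
             p.fovyDeg = 2.0 * std::atan(height / (2.0 * fy)) * 180.0 / PI; // degrees, for gluPerspective
             p.aspect  = (width * fy) / (height * fx);
             p.x0      = u0 - width / 2.0;    // viewport origin x
             p.y0      = height / 2.0 - v0;   // viewport origin y
             return p;
         }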
 
 
                
 
 
    Through these equations, we can calculate OpenGL parameters from the real camera parameters.
 
 
 
4. Experimental Result
    To examine the result, we rendered virtual objects on the 4 calibrated images shown below. Camera calibration of these images was done with the GML MATLAB Camera Calibration Toolbox, available at http://graphics.cs.msu.ru/en/research/calibration/index.html. In fact, the images are the sample images of the toolbox.
 
Calibrated image set
 
 
 
  
  
Augmentation results
 
 
    The augmentation results look quite reasonable visually. But how much error do we have here? The projection error can be computed simply as the difference between the coordinates of a point projected with the real camera parameters and with OpenGL’s. We can easily do the forward projection in OpenGL with the function ‘gluProject’, which returns the projected coordinates of a 3D vertex as floating-point values. We therefore measure the error at subpixel precision, even though OpenGL eventually maps the floating-point values to integers.
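
    A minimal sketch of this measurement is shown below (our own helper for illustration; it assumes the modelview, projection, and viewport have already been set from the real camera parameters, and it flips the y coordinate because gluProject returns window coordinates with a bottom-left origin):

        #include <GL/glu.h>
        #include <cmath>

        // Distance between the real-camera projection (uReal, vReal; top-left origin)
        // and OpenGL's forward projection of the same 3D point.
        double projectionError(double X, double Y, double Z,
                               double uReal, double vReal, int imageHeight)
        {
            GLdouble model[16], proj[16];
            GLint view[4];
            glGetDoublev(GL_MODELVIEW_MATRIX, model);
            glGetDoublev(GL_PROJECTION_MATRIX, proj);
            glGetIntegerv(GL_VIEWPORT, view);

            GLdouble winX, winY, winZ;
            gluProject(X, Y, Z, model, proj, view, &winX, &winY, &winZ);

            const double du = winX - uReal;
            const double dv = (imageHeight - winY) - vReal;  // convert to top-left origin
            return std::sqrt(du * du + dv * dv);
        }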
 
 
    The table below shows the mean and standard deviation of projection error.
 
 
   Where do these errors come from? Why are there errors even though we calculated the pixel coordinates as floating-point values? The reason is that OpenGL accepts some parameters only as integer values. The problem is the viewport setting: the function ‘glViewport’ accepts only integers.
 
glViewport(int x_0, int y_0, int width, int height) ;
 
    Consequently, there must be truncation errors. For example, x_0 and y_0 for one of the images we used (Image 3, the lower-right one) are 2.663581 and 3.678256 respectively; casting them to int yields 2 and 3, losing 0.663581 and 0.678256 pixels. You can see that the mean errors in the table are almost exactly these truncated fractional parts.
 
5. Remaining Work
    We do not consider lens distortion here; handling it remains future work.
 
 
 
6. Source code
    Here is our implementation of a simple camera class using OpenGL and OpenCV. It provides basic functions for projection and for calculating the OpenGL parameters. In the zip file, we also put a simple example for your convenience. The project was built and tested with Microsoft Visual Studio .NET 2003, but the code should work on any other platform where the required libraries are available, since it is written in plain C++.
    You may use our code freely for non-commercial applications and research purposes only. Feel free to contact us if there are any problems with our code. We would appreciate your comments.
 
    - Download : cameraExample.zip
    - Requirements : OpenGL, GLUT, OpenCV  
 
    
    Usage (refer to the example code for more details; a sketch follows below):
        1. Declare an instance of the camera class.
        2. Set the camera’s parameters and image size, from a file or manually.

       In your rendering function (e.g., RenderScene or display):
        3. Get the OpenGL parameters.
        4. Apply them through ‘glViewport’, ‘gluPerspective’, and ‘glLoadMatrix’.
        5. Draw your virtual objects.
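
    A sketch of what steps 3-5 might look like in a GLUT display callback (the GLParams struct and its fields are hypothetical stand-ins for the camera class’s accessors; see the example code for the actual interface):

        #include <GL/glut.h>

        // Hypothetical holder for the values computed from the real camera parameters.
        struct GLParams { double fovy, aspect, x0, y0; int width, height; double modelview[16]; };
        GLParams g_cam;  // assumed to be filled in at start-up (steps 1-2)

        void display()
        {
            glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

            // Steps 3-4: apply the converted parameters (near/far planes are assumed values).
            glViewport((GLint)g_cam.x0, (GLint)g_cam.y0, g_cam.width, g_cam.height);
            glMatrixMode(GL_PROJECTION);
            glLoadIdentity();
            gluPerspective(g_cam.fovy, g_cam.aspect, 0.1, 1000.0);
            glMatrixMode(GL_MODELVIEW);
            glLoadMatrixd(g_cam.modelview);  // extrinsics with the axis flip applied

            // Step 5: draw the virtual objects.
            glutSolidTeapot(1.0);
            glutSwapBuffers();
        }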
 
 
 
 

3 comments:

  1. Hi, the download link no longer exists, could you upload the source code again?

  2. Hi, it is really a great tutorial; thanks for all the detailed information. One thing I would like to comment on is that the given example zip file, I think, is not available anymore. It would be more complete if you could provide another working link for the download. Good job.

  3. Sorry for the link. I think the server is not working any more.. :(
