使用校准台进行相机校准

如图 1 所示,我们可以使用校准台进行相机校准。我们可以规定世界坐标系,即图中的 \(O_w,i_w,j_w,k_w\),从而可以规定棋盘点的各个世界坐标系下的坐标。同时,从拍摄图像中,我们可以获取到对应点的像素坐标。这样就能求解两者的变换关系。

图1 校准台

我们最终要求的就是投影变换矩阵 \(M = K \cdot \begin{bmatrix} R & \vert & t \end{bmatrix}\),即相机的内参和外参。写的更繁琐一点就是:

\(M = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix}\)

即最终求解这 12 个参数。

世界坐标系下的点为 \(P = \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \),投影公式为

\(s \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = M \cdot \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \)

可以得到

\(s \cdot x = m_{11} X + m_{12} Y + m_{13} Z + m_{14}\)

\(s \cdot y = m_{21} X + m_{22} Y + m_{23} Z + m_{24}\)

\(s = m_{31} X + m_{32} Y + m_{33} Z + m_{34}\)

整理写成方程的形式

\(\begin{bmatrix} X_i & Y_i & Z_i & 1 & 0 & 0 & 0 & 0 & -x_i X_i & -x_i Y_i & -x_i Z_i & -x_i \\ 0 & 0 & 0 & 0 & X_i & Y_i & Z_i & 1 & -y_i X_i & -y_i Y_i & -y_i Z_i & -y_i \end{bmatrix} \cdot \begin{bmatrix} m_{11} \\ m_{12} \\ m_{13} \\ m_{14} \\ m_{21} \\ m_{22} \\ m_{23} \\ m_{24} \\ m_{31} \\ m_{32} \\ m_{33} \\ m_{34} \end{bmatrix} = 0 \)

这样问题就转换为方程求解的最优化问题。一个点对应两条方程,即最少需要 6 个点可以求解。

求解得到 \(M\) 之后,需要再将其分解为内参和外参的形式。

怎么求解暂不做深究,之后知识储备够了再看。

实验

虽然原理细节不清楚,我们还是能实验感受一下。现实生活中没有布置校准台的条件,如图 2 所示,在 UE 中模拟布置了一下。使用的相机 fov 为 60,输出大小为 1920x1080。

图2 校准装置图片

如代码清单 1 所示,我们手动设置世界坐标和像素坐标。然后如图 3 所示,核对一下手动标注的位置是否正确。

代码清单 1 坐标点
  1. # Load the image
  2. image = cv2.imread("RenderOutput.png")
  3.  
  4. # Flip color channels for matplotlib compatibility
  5. image_rgb = np.flip(image, -1)
  6.  
  7. # 3D coordinates
  8. w_coor = np.array([
  9.     [8, 0, 9, 1], [8, 0, 1, 1], [6, 0, 1, 1], [6, 0, 9, 1],
  10.     [5, 1, 0, 1], [5, 9, 0, 1], [4, 9, 0, 1], [4, 1, 0, 1],
  11.     [0, 4, 7, 1], [0, 4, 8, 1], [0, 8, 3, 1], [0, 8, 7, 1]
  12. ])
  13.  
  14. # Camera coordinates
  15. c_coor = np.array([
  16.     [458, 113], [463, 678], [558, 648], [563, 119], [647, 704],
  17.     [1105, 831], [1134, 809], [687, 688], [994, 271], [999, 209],
  18.     [1195, 545], [1220, 296]
  19. ])
  20.  
  21. # Coordinates to validate if M is correct or not
  22. w_check = np.array([
  23.     [6, 0, 5, 1], [3, 3, 0, 1], [0, 4, 0, 1], [0, 4, 4, 1], [0, 0, 7, 1]
  24. ])
  25. c_check = np.array([
  26.     [561, 393], [822, 699], [966, 668], [982, 448], [796, 249]
  27. ])
  28.  
  29. # Plot original image
  30. plt.figure(figsize=(10, 8))
  31. plt.imshow(image_rgb)
  32. plt.title("Original Image")
  33. plt.axis('off')
  34. plt.show()
  35.  
  36. # Plot image with camera coordinates and 3D annotations
  37. plt.figure(figsize=(10, 8))
  38. plt.imshow(image_rgb)
  39.  
  40. # Plot primary camera coordinates with 3D annotations (red)
  41. for i, point in enumerate(c_coor):
  42.     plt.scatter(point[0], point[1], color='red', s=50)
  43.     plt.text(point[0] + 5, point[1] - 5, f"{w_coor[i, :3]}", color='white', fontsize=8)
  44.  
  45. # Plot check camera coordinates with 3D annotations (blue)
  46. for i, point in enumerate(c_check):
  47.     plt.scatter(point[0], point[1], color='blue', s=50)
  48.     plt.text(point[0] + 5, point[1] - 5, f"{w_check[i, :3]}", color='yellow', fontsize=8)
  49.  
  50. plt.title("Image with Camera Coordinates and Check Points")
  51. plt.axis('off')
  52. plt.show()
  53.  
  54. print(f"Shape of 3D coordinates (P): {w_coor.shape} (rows: {w_coor.shape[0]}, columns: {w_coor.shape[1]})")
  55. print(f"Shape of camera coordinates (p): {c_coor.shape} (rows: {c_coor.shape[0]}, columns: {c_coor.shape[1]})")
  56. print(f"Shape of validation camera coordinates (c_check): {c_check.shape} (rows: {c_check.shape[0]}, columns: {c_check.shape[1]})")
图3 手动标注

接着如代码清单 2 所示,构造求解矩阵,并求解。

代码清单 2 求解
  1. i = 0
  2. P = np.empty([12, 12], dtype=float)
  3.  
  4. while i < 12:
  5.     c = i // 2
  6.  
  7.     p1 = w_coor[c]
  8.     p2 = np.array([0, 0, 0, 0])
  9.     if i % 2 == 0:
  10.         p3 = -p1 * c_coor[c][0]
  11.         P[i] = np.hstack((p1, p2, p3))
  12.     elif i % 2 == 1:
  13.         p3 = -p1 * c_coor[c][1]
  14.         P[i] = np.hstack((p2, p1, p3))
  15.     i += 1
  16.  
  17. print(f"The matrix P has the shape: {P.shape} (rows: {P.shape[0]}, columns: {P.shape[1]})")
  18. print(f"The first row of P is: {P[0]}")
  19.  
  20. # Use SVD to decompose matrix P
  21. U, sigma, VT = LA.svd(P)
  22.  
  23. # Print the dimensions of the decomposition
  24. print(f"Shape of U (orthogonal matrix): {U.shape}")
  25. print(f"Shape of Sigma (diagonal values): {sigma.shape}")
  26. print(f"Shape of V (orthogonal matrix): {VT.shape}")
  27.  
  28. # V is the transpose of VT
  29. V = np.transpose(VT)
  30.  
  31. # Extract the last column of V (corresponding to the smallest singular value vector), a scalar multiple of the projection matrix M
  32. preM = V[:, -1]
  33.  
  34. # Reshape to a 3x4 form to approximate the projection matrix M
  35. roM = preM.reshape(3, 4)
  36.  
  37. # Print the result of the projection matrix M
  38. print("A scalar multiple of the projection matrix M, stored as roM:")
  39. print(roM)

接着如代码清单 3 所示,对求解得到的矩阵进行分解,得到相机的内参和外参。

代码清单 3 分解
  1. # Extract the matrix A and vector b from roM
  2. A = roM[0:3, 0:3].copy()  # Extract the 3x3 matrix A (corresponding to the [R] part)
  3. b = roM[0:3, 3:4].copy()  # Extract the 3x1 vector b (corresponding to the [T] part)
  4.  
  5. # Print the results of A and b, and clearly explain their meanings
  6. print("The projection matrix M can be written in the form [A | b], where:")
  7. print("- A is a 3x3 matrix containing the combined intrinsic and rotation matrix (K * R).")
  8. print("- b is a 3x1 vector representing the intrinsic and translation relationship (K * T).")
  9. print("\nHere are the results:")
  10. print("A (3x3):")
  11. print(A)
  12. print("b (3x1):")
  13. print(b)
  14.  
  15. # Extract the transpose of a3 from matrix A (corresponding to the a3 vector in the formula)
  16. a3T = A[2]  # The third row vector a3
  17.  
  18. # Calculate ρ (rho), corresponding to the formula: ρ = ±1 / ||a3||
  19. under = LA.norm(a3T)  # Norm of ||a3||
  20. ro01 = 1 / under  # Calculation of ρ
  21. print("The value of rho (ρ) is: %f" % ro01)
  22.  
  23. # Calculate cx and cy, corresponding to the formulas:
  24. # cx = ρ^2 * (a1 a3)
  25. # cy = ρ^2 * (a2 a3)
  26. a1T = A[0]  # The first row vector a1
  27. a2T = A[1]  # The second row vector a2
  28. cx = ro01 * ro01 * (np.dot(a1T, a3T))  # Dot product calculation for cx
  29. cy = ro01 * ro01 * (np.dot(a2T, a3T))  # Dot product calculation for cy
  30. print("The computed values are: cx=%f, cy=%f" % (cx, cy))
  31.  
  32. # Calculate θ (theta), corresponding to the formula:
  33. # θ = cos^(-1)(-((a1 × a3) (a2 × a3)) / (||a1 × a3|| * ||a2 × a3||))
  34. a_cross13 = np.cross(a1T, a3T)  # Cross product calculation for a1 × a3
  35. a_cross23 = np.cross(a2T, a3T)  # Cross product calculation for a2 × a3
  36. theta = np.arccos((-1) * np.dot(a_cross13, a_cross23) /
  37.                   (LA.norm(a_cross13) * LA.norm(a_cross23)))  # Calculation of θ
  38. print("The computed value of theta (θ) is: %f" % theta)
  39.  
  40. # Calculate α (alpha) and β (beta), corresponding to the formulas:
  41. # α = ρ^2 * ||a1 × a3|| * sin(θ)
  42. # β = ρ^2 * ||a2 × a3|| * sin(θ)
  43. alpha = ro01 * ro01 * LA.norm(a_cross13) * np.sin(theta)  # Calculation of α
  44. beta = ro01 * ro01 * LA.norm(a_cross23) * np.sin(theta)  # Calculation of β
  45. print("The computed values are: alpha=%f, beta=%f" % (alpha, beta))
  46.  
  47. # Construct the intrinsic matrix K, according to the formula:
  48. # K = [[α, -α * cot(θ), cx],
  49. #      [0, β / sin(θ), cy],
  50. #      [0, 0, 1]]
  51. K = np.array([
  52.     [alpha, -alpha * (1 / np.tan(theta)), cx],  # First row
  53.     [0, beta / np.sin(theta), cy],              # Second row
  54.     [0, 0, 1]                                   # Third row
  55. ])
  56. K = K.reshape(3, 3)  # 3x3 intrinsic matrix
  57. print("The computed intrinsic matrix K is: ")
  58. print(K)
  59.  
  60. # Calculate R
  61. r1 = a_cross23 / LA.norm(a_cross23)
  62. r301 = ro01 * a3T
  63. r2 = np .cross(r301, r1)
  64. R = np.hstack((r1, r2, r301))
  65. R = R.reshape(3, 3)
  66. print("We can get R:")
  67. print(R)
  68.  
  69. # Calculate T
  70. T = ro01 * np.dot(LA.inv(K), b)
  71. print("We can get T:")
  72. print(T)

最后我们结合测试的点,验证计算的结果是否正确。

  1. my_size = c_check.shape[0]
  2. my_err = np.empty([my_size])
  3. pred_points = []
  4. for i in range(my_size):
  5.     test_pix = np.dot(roM, w_check[i])
  6.     u = test_pix[0] / test_pix[2]
  7.     v = test_pix[1] / test_pix[2]
  8.     u_c = c_check[i][0]
  9.     v_c = c_check[i][1]
  10.     print(f"You get the test point {i} with the result ({u:.2f},{v:.2f})")
  11.     print(f"The correct result is ({u_c:.2f},{v_c:.2f})")
  12.     my_err[i] = (abs(u - u_c) / u_c + abs(v - v_c) / v_c) / 2
  13.     pred_points.append((u, v))

从打印信息可以看到,结果很接近。

  • You get the test point 0 with the result (560.42,396.71)
  • The correct result is (561.00,393.00)
  • You get the test point 1 with the result (820.42,699.49)
  • The correct result is (822.00,699.00)
  • You get the test point 2 with the result (964.16,668.92)
  • The correct result is (966.00,668.00)
  • You get the test point 3 with the result (984.60,446.89)
  • The correct result is (982.00,448.00)
  • You get the test point 4 with the result (799.38,250.77)
  • The correct result is (796.00,249.00)

我们回过头来看计算得到的内参矩阵,可以发现和 UE 理论模拟的值不一样。

  • The computed intrinsic matrix K is:
  • [[ 1.60606238e+03 -3.41091605e+01  8.23986765e+02]
  •  [ 0.00000000e+00  1.57353395e+03  6.53139245e+02]
  •  [ 0.00000000e+00  0.00000000e+00  1.00000000e+00]]

实验感受下来,觉得这个流程可能有点问题,虽然测试点能对应上,但是还是在相同视角计算的,只能保证计算得到的 M 是正确的。特别是分解内外参这步如何保证唯一比较疑惑。留作问题。