Tracking in 3D: Computer Vision

Suppose you get a bunch of oishi features in 2D images, how to get 3D coordinates from them?

Very briefly, you typically use the following code

import fish_3d as f3
import fish_track as ft

clusters_nv = []  # nv = n view

for i in range(n_views):
    clusters = ft.oishi.get_clusters(
        features_nv[i], kernels_nv[i], angles
    )
    clusters_nv.append(clusters)

matched_indices, matched_centres, reproj_errors = f3.three_view_cluster_match(
    clusters_nv, cameras, tol_2d, sample_size, depth
)

The matched_centres is the 3D coordinates constructed from many 2D features.

But what is happening? What are the meanings of those variables? What is the business happening inside the code?

Here is an detailed explaination.

Synchronised Cameras

under construction …

Epipolar Geometry

under construction …

Camera Distortion

under construction …

Water refraction

under construction …

Cost function

under construction …