Date of Award
Master of Science (MS)
Birchfield, Stanley
Hoover, Adam
This thesis examines the problem of pose estimation: determining the pose, that is, the position and orientation, of an object in some coordinate system. In particular, it examines pose estimation techniques that use either monocular or binocular vision systems.
In general, the pose of an object is found by generating a set of matching features, which may be points or lines, between a model of the object and the current image of the object. These matches are then used to determine the pose of the imaged object. The algorithms presented in this thesis all generate candidate matches and then use those matches to generate poses.
The two monocular pose estimation techniques examined are two versions of SoftPOSIT: the traditional approach using point features, and a more recent approach using line features. The two algorithms work in the same way and differ only in the features they use. Both start from a random initial guess of the object's pose. Using this pose, a set of possible feature matches is generated, and these matches are then used to refine the pose so that the distances between matched features are reduced. Once the pose is refined, a new set of matches is generated. The process is repeated until convergence, i.e., minimal or no change in the pose. Because the matched features depend on the initial pose, the algorithm's output depends on the initial guess. The algorithm is therefore started from a variety of different poses, with the goal of finding the correct correspondences and hence the correct pose.
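The match-then-refine loop with multiple starting poses can be illustrated on a toy problem. The sketch below is not SoftPOSIT: it assumes a 2-D rigid alignment with hard nearest-neighbor matches and a least-squares (Kabsch) pose update, and omits SoftPOSIT's soft assignment and camera projection. All function names, the translation range, and the parameter defaults are hypothetical choices made for illustration.

```python
import numpy as np

def rotation(angle):
    """2-D rotation matrix for an angle in radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def fit_rigid(src, dst):
    """Least-squares rotation R and translation t with R @ src[i] + t ~ dst[i]."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    return R, mu_d - R @ mu_s

def nearest_sq_dists(moved, image):
    """Squared distances from every moved model point to every image point."""
    return ((moved[:, None, :] - image[None, :, :]) ** 2).sum(axis=2)

def refine_from(model, image, angle, t, max_iter=50, tol=1e-9):
    """Alternate matching and pose refinement from one initial pose guess."""
    R = rotation(angle)
    for _ in range(max_iter):
        # Step 1: generate matches under the current pose (hard nearest neighbor).
        matches = nearest_sq_dists(model @ R.T + t, image).argmin(axis=1)
        # Step 2: refine the pose to reduce distances between matched points.
        R_new, t_new = fit_rigid(model, image[matches])
        change = np.abs(R_new - R).max() + np.abs(t_new - t).max()
        R, t = R_new, t_new
        if change < tol:  # converged: minimal or no change in the pose
            break
    d2 = nearest_sq_dists(model @ R.T + t, image)
    return R, t, np.sqrt(d2.min(axis=1)).mean()

def multi_start(model, image, n_starts=20, seed=0):
    """Run the loop from several random poses and keep the lowest residual."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        angle = rng.uniform(-np.pi, np.pi)
        t = rng.uniform(-5.0, 5.0, size=2)  # assumed search range for translation
        candidate = refine_from(model, image, angle, t)
        if best is None or candidate[2] < best[2]:
            best = candidate
    return best
```

Each call to `refine_from` can settle on a different set of matches, which is why the result depends on the starting pose; `multi_start` simply keeps the run with the smallest mean match distance.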
The binocular pose estimation technique presented attempts to match 3-D point data from a model of an object to 3-D point data generated from the current view of the object. In both cases the point data is generated using a stereo camera. The algorithm attempts to match 3-D point triplets in the model to 3-D point triplets from the current view, and then uses these matched triplets to obtain the pose parameters that describe the object's location and orientation in space.
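One way to picture triplet-based matching is the following minimal sketch. It rests on assumptions the abstract does not state: triplets are compared by their three side lengths (which a rigid motion preserves), the pose is computed from a matched triplet with a least-squares (Kabsch) fit, and candidate poses are ranked by how many model points land near a scene point. The function names and the tolerance are hypothetical, and the exhaustive search is only practical for small point sets.

```python
import itertools
import numpy as np

def fit_rigid_3d(src, dst):
    """Least-squares rotation R and translation t with R @ src[i] + t ~ dst[i]."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, mu_d - R @ mu_s

def side_lengths(tri):
    """Ordered side lengths of a triplet; unchanged by rotation and translation."""
    return np.array([
        np.linalg.norm(tri[0] - tri[1]),
        np.linalg.norm(tri[1] - tri[2]),
        np.linalg.norm(tri[2] - tri[0]),
    ])

def count_inliers(model, scene, R, t, tol):
    """Number of model points that land within tol of some scene point."""
    moved = model @ R.T + t
    dists = np.linalg.norm(moved[:, None, :] - scene[None, :, :], axis=2)
    return int((dists.min(axis=1) < tol).sum())

def pose_from_triplets(model, scene, tol=1e-6):
    """Try every model/scene triplet pairing and keep the best-supported pose.

    Assumes the triplets are not collinear, since three collinear points
    do not determine a unique rotation.
    """
    best = (None, None, -1)
    for mi in itertools.combinations(range(len(model)), 3):
        m_tri = model[list(mi)]
        m_sides = side_lengths(m_tri)
        # Permutations cover every ordering of the scene triplet.
        for si in itertools.permutations(range(len(scene)), 3):
            s_tri = scene[list(si)]
            if np.abs(side_lengths(s_tri) - m_sides).max() > tol:
                continue  # shapes differ, so this cannot be a match
            R, t = fit_rigid_3d(m_tri, s_tri)
            n = count_inliers(model, scene, R, t, tol)
            if n > best[2]:
                best = (R, t, n)
    return best
```

With noisy stereo data the tolerance would need to be much looser than the default used here, and symmetric objects would produce several equally well-supported poses.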
Each algorithm is used to estimate the pose of three different low-texture manufactured objects across a sample set of 95 images, and the results are presented. The results of the two monocular methods are directly compared and examined. The results of the binocular method are examined as well, and then all three algorithms are compared. Of the three methods, the binocular method performed best by a significant margin. The objects searched for all had low feature counts, low surface texture variation, and multiple degrees of symmetry. The results indicate that it is generally hard to robustly determine the pose of these types of objects. Finally, suggestions are made for improvements to the algorithms that may lead to better pose results.
Kriener, Robert, "A Comparison and Evaluation of Three Different Pose Estimation Algorithms In Detecting Low Texture Manufactured Objects" (2011). All Theses. 1290.