Looking for objects in a scene is undoubtedly a difficult task. Object recognition works reasonably well, but it is resource-demanding, and if the environment is complex or the object is not unique in the scene, it can produce many errors.
A simpler way to detect and track an object (or surface) is to attach unique markers to it and track those instead!
ArUco markers (like the ones used in the example below) are widely used in augmented reality applications to reliably track surfaces. Whatever the application, this is a simple way to track a marker using Python and OpenCV.
In the example below, the program looks for the marker frame by frame, gets its location and orientation by calculating the homography transformation matrix relative to the camera plane, and then corrects the rectangle highlighting the surface being tracked.
Here is a walkthrough of the source code, which is available on GitHub:
A little helper function, “which”, looks for the items provided by its second argument, “values”, among the items provided by its first argument, “x”, and returns the indices of the common items relative to the first argument:
def which(x, values):
    indices = []
    for ii in list(values):
        if ii in x:
            indices.append(list(x).index(ii))
    return indices
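To see how the helper behaves, here is a quick hypothetical example (the marker IDs are illustrative, not from the article; the function is repeated so the snippet runs standalone):

```python
def which(x, values):
    indices = []
    for ii in list(values):
        if ii in x:
            indices.append(list(x).index(ii))
    return indices

ref_ids = [12, 35, 7, 42]    # hypothetical marker IDs in the reference image
detected_ids = [42, 12]      # hypothetical IDs detected in the current frame

# Positions of the detected IDs within the reference list:
print(which(ref_ids, detected_ids))  # [3, 0]
```

Note that IDs not present in “x” are silently skipped, so the result can be shorter than “values”.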
The “while” loop processes the frames of the video one at a time. After getting a frame and converting it to grayscale (for more efficient processing), we can detect any ArUco markers present in the frame:
res_corners, res_ids, _ = cv2.aruco.detectMarkers(gray, aruco_dict, parameters = parameters)
It is very important to prepare the array with all marker corners correctly; in the code we do this by flattening the arrays of marker corners in these lines:
these_res_corners = np.concatenate(res_corners, axis = 1)
these_ref_corners = np.concatenate([refCorners[x] for x in idx], axis = 1)
This is necessary because the array returned by the “detectMarkers” function is three-dimensional, with the corners of each detected marker grouped separately. For the “findHomography” function to work, all corners need to be grouped together along the same dimension of the array.
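The shape change can be illustrated with plain NumPy. The corner values below are made up; what matters is that each detected marker contributes one (1, 4, 2) array, and concatenating along axis 1 merges them into a single point set:

```python
import numpy as np

# Two fake markers, each with 4 corners of (x, y) coordinates, shaped (1, 4, 2)
# exactly as detectMarkers returns them (one array per marker).
marker_a = np.arange(8, dtype=float).reshape(1, 4, 2)
marker_b = np.arange(8, 16, dtype=float).reshape(1, 4, 2)
res_corners = [marker_a, marker_b]

# Concatenating along axis 1 flattens the per-marker grouping:
these_res_corners = np.concatenate(res_corners, axis=1)
print(these_res_corners.shape)  # (1, 8, 2): all 8 corners in one point set
```

findHomography can then pair these 8 points with the corresponding 8 reference corners.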
Now we can calculate the homography matrix from the markers detected in the frame relative to the reference image:
this_h, s = cv2.findHomography(these_ref_corners, these_res_corners, cv2.RANSAC, 5.0)
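For intuition, here is a minimal direct-linear-transform (DLT) sketch of what findHomography computes, without the RANSAC outlier rejection the call above requests. All names and the test points are illustrative:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography H mapping src points to dst points (DLT).

    src, dst: (N, 2) arrays of matching points, N >= 4.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H's entries.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H[2, 2] == 1

src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
dst = src * 2 + [5, 7]  # a pure scale-plus-shift mapping
H = estimate_homography(src, dst)
print(np.round(H, 3))  # ~[[2, 0, 5], [0, 2, 7], [0, 0, 1]]
```

OpenCV's version adds robust estimation (here, RANSAC with a 5.0-pixel reprojection threshold), which matters when some marker corners are detected inaccurately.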
Next, using the estimated homography matrix, we can transform the perspective of the bounding box from the reference image to the frame and draw it around the detected object:
newRect = cv2.perspectiveTransform(rect, this_h)
frame = cv2.polylines(frame, [np.int32(newRect)], True, (0, 0, 0), 10)
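Under the hood, perspectiveTransform appends w = 1 to each point, multiplies by the homography, and divides by the projected w. A plain-NumPy sketch with an illustrative homography (a pure translation, chosen so the expected result is obvious):

```python
import numpy as np

# Illustrative homography: shift everything by (40, 25) pixels.
H = np.array([[1.0, 0.0, 40.0],
              [0.0, 1.0, 25.0],
              [0.0, 0.0, 1.0]])

# Bounding-box corners shaped (4, 1, 2), as perspectiveTransform expects.
rect = np.array([[[0.0, 0.0]], [[100.0, 0.0]],
                 [[100.0, 50.0]], [[0.0, 50.0]]])

pts = rect.reshape(-1, 2)
homogeneous = np.hstack([pts, np.ones((pts.shape[0], 1))])  # append w = 1
projected = (H @ homogeneous.T).T
new_rect = (projected[:, :2] / projected[:, 2:]).reshape(-1, 1, 2)
print(new_rect[0, 0])  # [40. 25.]
```

The division by w is what makes this a perspective (projective) warp rather than a plain affine one; for this translation-only H, w stays 1 throughout.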
I am planning other articles on marker tracking and how it can be used to estimate perspective in both camera and world coordinates. Stay tuned for more on that topic!
In the meantime, if you have any comments or questions, or would like to apply such a solution in your own project, feel free to contact me via this form or via the social media links below.
Edit (11.12.2019): I have added a walkthrough of the source code and updated the source on GitHub.