Monocular vision-based displacement sensing for shunting truck guidance

Schreiber, L. A. 2025. Monocular Vision-based Displacement Sensing for Shunting Truck Guidance. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/691b8e29-5ebd-46b3-a74d-0f1834193a59

Bibliographic Details
Main Author: Schreiber, Leon Amos
Other Authors: Venter, Gerhard
Format: Thesis
Language: English
Published: Stellenbosch University 2025
Subjects: Computer vision -- Industrial applications; Trucks -- Automatic control; Optical measurements; UCTD
_version_ 1867613907953123328
access_status_str Open Access
author Schreiber, Leon Amos
author2 Venter, Gerhard
author_browse Schreiber, Leon Amos
Venter, Gerhard
author_facet Venter, Gerhard
Schreiber, Leon Amos
author_sort Schreiber, Leon Amos
collection Thesis
description Schreiber, L. A. 2025. Monocular Vision-based Displacement Sensing for Shunting Truck Guidance. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/691b8e29-5ebd-46b3-a74d-0f1834193a59
format Thesis
id oai:scholar.sun.ac.za:10019.1/132273
institution Stellenbosch University (South Africa)
language English
last_indexed 2026-06-10T12:43:36.943Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2025
publishDateRange 2025
publishDateSort 2025
publisher Stellenbosch University
publisherStr Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/132273 Monocular vision-based displacement sensing for shunting truck guidance Schreiber, Leon Amos Venter, Gerhard Schreve, Kristiaan Stellenbosch University. Faculty of Engineering. Dept. of Mechanical and Mechatronic Engineering. Computer vision -- Industrial applications Trucks -- Automatic control Optical measurements UCTD Schreiber, L. A. 2025. Monocular Vision-based Displacement Sensing for Shunting Truck Guidance. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/691b8e29-5ebd-46b3-a74d-0f1834193a59 Thesis (MEng)--Stellenbosch University, 2025.
ENGLISH ABSTRACT: Automation in industry is becoming increasingly prevalent because it improves safety, lowers cost, and shortens lead times. Shunting truck automation is worthwhile since it will decrease lead times in large-scale yards such as harbours and distribution centres. Self-driving vehicles require depth information about their surroundings to orient themselves, which enables safe navigation around obstacles and, where needed, towards them. LiDAR is used extensively in industrial applications since it provides the automated vehicle with an accurate point cloud of depth information, but it is costly. Existing non-LiDAR techniques for obtaining depth information combine different sensor inputs, such as ultrasonic and infrared time-of-flight sensors. Combining sensor inputs adds complexity and often leads to higher costs. This work presents a method to estimate the angle and distance from the camera to a container face using a single camera. The method aims to provide accurate depth information cost-effectively in order to automate the shunting truck and trailer hooking process. The proposed method computes the projective transformation between computer-identified points on a container face and predetermined ground-truth points, and then recovers the displacement through a homography decomposition.
Open-source software is used to perform the projective transformation calculations and homography decomposition. The aim is to gather enough information to serve as the sole sensory input to a control system that automates the trailer hooking process of a shunting truck. Two methods were tested to obtain key points on a container face. The first method combined two YOLOv5 networks: one custom-trained to detect containers and the second using the first network’s output to detect lock holes as key points. The second method involves training a YOLOv7-pose detection model on a custom container dataset. YOLOv7-pose is modified from its original 17-key-point human detection configuration to operate on only four key points. The modified YOLOv7-pose automatically obtains key points on the container face corner castings to calculate the projective transformation (homography) between the image and the actual container face. The homography is calculated between these detected key points and ground-truth key points derived from the container corner casting dimensions. The homography is then used to calculate the camera distance and angular displacement from the container face. The core idea is that the method can calculate the distance between any chosen camera and a flat-faced target object, provided the object has a known, constant size. This requires training a neural network customised for the object’s geometry so that the model can identify the key points needed to compute the homography. The results were collected over a testing distance range of 1.50 m to 5.00 m and an angle range from −17.5° to +17.5°. The model demonstrated average distance errors of 0.064 m and 0.026 m and maximum distance errors of 0.370 m and 0.065 m for the two neural networks used during testing. For angle measurements, the model achieved average errors of 2.14° and 1.56° and maximum errors of 9.78° and 4.25°.
At the maximum testing distance of 5.00 m, the calculated combination of distance and angular errors was 0.136 m, which meets the 0.250 m requirement.
AFRIKAANS ABSTRACT (translated): Automation in industry is becoming more common because it results in better safety, lower costs, and shorter turnaround times. Automating shunting trucks by means of self-driving vehicles can lead to shorter turnaround times at large-scale sites such as harbours and distribution centres. Self-driving vehicles need depth data about their surroundings to orient themselves, which makes safe navigation around obstacles and trailer hooking actions possible. LiDAR is used on a large scale in industrial applications and provides the automated vehicle with accurate depth data; its high cost, however, is a drawback. Apart from LiDAR, existing depth-gathering techniques often combine different sensors, such as ultrasonic and infrared sensors. This combination of inputs increases complexity and often leads to higher costs. This thesis presents a method to estimate the angle and distance between the shunting truck and a shipping container face using a single camera. The method seeks to provide accurate depth data cost-effectively for the automated hooking of a trailer. The proposed method uses the projective transformation between computer-identified points on a shipping container face and predetermined ground-truth points to determine the displacement by means of a homography decomposition. Open-source code is used to perform the projective transformation calculations and the homography decomposition. The aim is to gather sufficient information to supply to a control system as the sole sensory input so that a truck can hook a trailer automatically. Two methods were tested to obtain key points on a shipping container face.
The first method combined two YOLOv5 networks: one specially trained to identify shipping containers and one that uses the first network’s result to identify container corner castings as key points. The second method involves training a YOLOv7-pose pose estimation model on a custom shipping container dataset. YOLOv7-pose was modified from its original 17-key-point configuration to work with only four key points. The modified YOLOv7-pose network automatically identifies key points on the corner castings of the container face to calculate the projective transformation (homography) between the image and the actual container face. These key points are used to calculate the homography between them and the ground-truth key points of the container’s corner castings. The homography is used to calculate the camera’s distance and angular displacement from the container face. The core idea of this method is the ability to calculate distances between any chosen camera and a flat target, as long as the object has a known, constant size. To achieve this, a neural network adapted to the object’s geometry must be trained so that the model can identify the key points needed to calculate the homography. The results were collected over a test distance of 1.50 m to 5.00 m and an angle range of −17.5° to +17.5°. The model showed average distance errors of 0.064 m and 0.026 m and maximum distance errors of 0.370 m and 0.065 m for the two neural networks used during testing. The model achieved average angle measurement errors of 2.14° and 1.56° and maximum errors of 9.78° and 4.25°. At the maximum test distance of 5.00 m, the calculated combination of the distance and angle errors was 0.136 m, which meets the 0.250 m requirement.
Masters 2025-06-02T09:51:40Z 2025-06-02T09:51:40Z 2025-03 Thesis https://scholar.sun.ac.za/handle/10019.1/132273 en application/pdf Stellenbosch University
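The homography-to-displacement step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the thesis relies on open-source software and a homography decomposition, whereas this sketch swaps in a dependency-free direct linear solve for exactly four point pairs, followed by the standard planar pose recovery from H = K [r1 r2 t] under a pinhole camera model. The function names, the intrinsics (fx, fy, cx, cy), the 2.4 m by 2.6 m key-point rectangle, and the synthetic camera pose are all assumptions chosen for illustration.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def homography_from_4_points(plane_pts, image_pts):
    """3x3 homography (h33 fixed to 1) mapping plane (X, Y) to image (u, v)."""
    a, b = [], []
    for (X, Y), (u, v) in zip(plane_pts, image_pts):
        a.append([X, Y, 1.0, 0.0, 0.0, 0.0, -u * X, -u * Y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, X, Y, 1.0, -v * X, -v * Y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def pose_from_homography(H, fx, fy, cx, cy):
    """Distance (m) to the plane origin and yaw (deg) about the camera's vertical axis."""
    # Columns of K^-1 H are proportional to r1, r2 and t.
    cols = []
    for j in range(3):
        h1, h2, h3 = H[0][j], H[1][j], H[2][j]
        cols.append(((h1 - cx * h3) / fx, (h2 - cy * h3) / fy, h3))
    # r1 is a unit vector, which fixes the unknown scale.
    scale = 1.0 / math.sqrt(sum(c * c for c in cols[0]))
    r1 = [scale * c for c in cols[0]]
    t = [scale * c for c in cols[2]]
    distance = math.sqrt(sum(c * c for c in t))
    yaw_deg = math.degrees(math.atan2(-r1[2], r1[0]))
    return distance, yaw_deg

def project(X, Y, yaw_deg, t, fx, fy, cx, cy):
    """Project a plane point for a camera pose with yaw only (synthetic test data)."""
    th = math.radians(yaw_deg)
    x = math.cos(th) * X + t[0]
    y = Y + t[1]
    z = -math.sin(th) * X + t[2]
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical ground-truth key points (metres), centred on the container face.
W, H_FACE = 2.4, 2.6
plane = [(-W / 2, -H_FACE / 2), (W / 2, -H_FACE / 2),
         (W / 2, H_FACE / 2), (-W / 2, H_FACE / 2)]

# Hypothetical intrinsics and a synthetic pose standing in for detected key points.
fx = fy = 1000.0
cx, cy = 960.0, 540.0
true_t = (0.2, -0.1, 3.0)
image = [project(X, Y, 10.0, true_t, fx, fy, cx, cy) for X, Y in plane]

H = homography_from_4_points(plane, image)
distance, yaw = pose_from_homography(H, fx, fy, cx, cy)
print(f"distance = {distance:.3f} m, yaw = {yaw:.2f} deg")
```

In a real pipeline the `image` points would come from the key-point detector rather than from `project`, and a least-squares or RANSAC homography fit would be a more robust choice than the exact four-point solve once detections are noisy.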
spellingShingle Computer vision -- Industrial applications
Trucks -- Automatic control
Optical measurements
UCTD
Schreiber, Leon Amos
Monocular vision-based displacement sensing for shunting truck guidance
title Monocular vision-based displacement sensing for shunting truck guidance
title_full Monocular vision-based displacement sensing for shunting truck guidance
title_fullStr Monocular vision-based displacement sensing for shunting truck guidance
title_full_unstemmed Monocular vision-based displacement sensing for shunting truck guidance
title_short Monocular vision-based displacement sensing for shunting truck guidance
title_sort monocular vision based displacement sensing for shunting truck guidance
topic Computer vision -- Industrial applications
Trucks -- Automatic control
Optical measurements
UCTD
url https://scholar.sun.ac.za/handle/10019.1/132273
work_keys_str_mv AT schreiberleonamos monocularvisionbaseddisplacementsensingforshuntingtruckguidance