
DESKBOT Update 5



YOLO - You Only Look Once 

Other detection methods repurpose classifiers or localizers: the model is applied to an image at multiple locations and scales, and the high-scoring regions of the image are treated as detections.
  
In YOLO the approach is quite different. A single neural network is applied to the full image; the network divides the image into regions and predicts bounding boxes for each region, weighting each box by its predicted probability.
  
We originally planned to use Mask R-CNN for better accuracy, but due to the COVID-19 situation and working from home we do not have access to a good computing system or GPU resources, so we chose to work with YOLO instead. YOLO is faster and still gives bounding boxes for the detected objects.
  
The off-the-shelf YOLO model is trained on the COCO dataset, which covers 80 classes. It is not trained to detect pens, staplers, erasers, and many of the other objects we are concerned with in our case. Therefore we collected a custom dataset to train a YOLO model to detect the required classes.
  
First, we used an Intel RealSense camera to collect about 70 images for training, and then labeled those 70 images with an annotation tool. Once annotated, we trained a model with Darknet for about 10,000 iterations. A few results from the test set after training are shown below in Figures 1 and 2.
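One way to run the trained Darknet weights for inference is through OpenCV's DNN module. The sketch below illustrates this; it is not necessarily our exact pipeline, and the file names (yolov3-deskbot.cfg, yolov3-deskbot.weights, deskbot.names) are placeholders for the custom configuration, weights, and class list.

```python
import cv2
import numpy as np

# Placeholder file names for the custom-trained model (assumptions).
net = cv2.dnn.readNetFromDarknet("yolov3-deskbot.cfg", "yolov3-deskbot.weights")
classes = open("deskbot.names").read().strip().split("\n")

img = cv2.imread("test.jpg")
h, w = img.shape[:2]

# Darknet-style preprocessing: scale to [0, 1], resize, BGR -> RGB.
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

boxes, confidences, class_ids = [], [], []
for output in outputs:
    for detection in output:
        scores = detection[5:]
        class_id = int(np.argmax(scores))
        confidence = float(scores[class_id])
        if confidence > 0.5:
            # YOLO predicts box centers and sizes relative to image size.
            cx, cy, bw, bh = detection[:4] * np.array([w, h, w, h])
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(confidence)
            class_ids.append(class_id)

# Non-maximum suppression removes overlapping duplicate boxes.
keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
for i in np.array(keep).flatten():
    x, y, bw, bh = boxes[i]
    cv2.rectangle(img, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
    cv2.putText(img, classes[class_ids[i]], (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
```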
  
For image capture we used pyrealsense2, a Python package that integrates the RealSense camera with Python. With pyrealsense2 we can grab frames from the camera and convert them to NumPy arrays for use with OpenCV, as sketched below.
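A minimal sketch of the capture step, assuming a 640x480 BGR color stream at 30 FPS (the stream settings are an assumption):

```python
import pyrealsense2 as rs
import numpy as np
import cv2

# Configure the RealSense pipeline for a color stream.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

try:
    # Block until a coherent set of frames arrives.
    frames = pipeline.wait_for_frames()
    color_frame = frames.get_color_frame()
    # Convert the frame buffer to a NumPy array usable by OpenCV.
    color_image = np.asanyarray(color_frame.get_data())
    cv2.imwrite("frame.png", color_image)
finally:
    pipeline.stop()
```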


 
Figure 1: The top image is the raw image with objects; the bottom image is the same image with the detected objects.




Figure 2: Another example. The top image is the raw image and the bottom image shows the objects detected using the model.

Detecting the Hamster Robot using ArUco Markers: 

What are ArUco Markers?  

ArUco markers were originally developed in 2014 by S. Garrido-Jurado et al. in their work "Automatic generation and detection of highly reliable fiducial markers under occlusion". ArUco markers are small 2D barcodes: each marker corresponds to a number, encoded in a small grid of black and white cells. ArUco markers are very useful as tags for many robotics and augmented reality applications. For example, one may place an ArUco marker next to a robot's charging station, an elevator button, or an object that a robot should manipulate.
In our project we use ArUco markers to detect the Hamster robot and its position with the help of the Intel RealSense camera. While the YOLO object detector detects the objects on the table, the robot itself is detected using ArUco marker detection. OpenCV is used to detect the ArUco markers: the OpenCV ArUco detector provides a bounding box and the corner positions for every unique marker. Some example outputs are shown below.
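A minimal sketch of the detection call, assuming the pre-4.7 cv2.aruco API and a 4x4 marker dictionary (the dictionary choice is an assumption, and the exact function names vary across OpenCV versions):

```python
import cv2

# Assumed 4x4 dictionary; the project may use a different one.
aruco_dict = cv2.aruco.Dictionary_get(cv2.aruco.DICT_4X4_50)
params = cv2.aruco.DetectorParameters_create()

img = cv2.imread("table.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# corners: one 4x2 corner array per marker; ids: the decoded marker numbers.
corners, ids, rejected = cv2.aruco.detectMarkers(gray, aruco_dict, parameters=params)
if ids is not None:
    cv2.aruco.drawDetectedMarkers(img, corners, ids)
    for marker_id, c in zip(ids.flatten(), corners):
        # Marker center is the mean of its four corners.
        cx, cy = c[0].mean(axis=0)
        print(f"marker {marker_id} at ({cx:.1f}, {cy:.1f})")
```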





Reference [1] contains the ArUco API used to detect the ArUco markers.  

Next Week Update:
Now that we can detect both the objects and the Hamster robots through vision, we can perform path planning for them. A* and Jump Point Search will be implemented over the coming week to bring the whole project together, so that the robots can push objects to their respective slots based on feedback received through vision.
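As a preview, here is a minimal grid-based A* sketch with 4-connected moves and a Manhattan-distance heuristic. The occupancy-grid representation and the example map are assumptions for illustration; the actual planner will run on the table map built from vision.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 2D occupancy grid (0 = free, 1 = obstacle)."""
    rows, cols = len(grid), len(grid[0])
    heuristic = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    open_set = [(heuristic(start), 0, start, None)]  # (f, g, node, parent)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, node, parent = heapq.heappop(open_set)
        if node in came_from:  # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:
            # Reconstruct the path by walking parents back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_set,
                                   (ng + heuristic((nr, nc)), ng, (nr, nc), node))
    return None  # no path found

# Toy map: plan around the obstacles from the top-left to the bottom-right.
grid = [[0, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 3)))
```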

Reference:  





