Machine learning is everywhere, from the cars people drive to the phones in our pockets. Being constantly exposed to the idea that machines can learn made me think about it often, yet I could not wrap my head around how it was even possible. To learn more about machine learning and its implications, I decided to take up a project: a gesture-controlled drone. The premise was a drone that sees a hand gesture through its camera and responds with a series of programmable actions.
Bringing this project to fruition would not be possible with traditional code, which would require the image the drone captures to match the picture it expects pixel for pixel. That is not feasible, because the drone should work against many different backgrounds and with hands of many shapes and sizes. The drone would therefore need to be trained to understand what it perceives, and that is where artificial intelligence comes in.
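To make that limitation concrete, here is a minimal sketch that is not taken from the project itself. It compares a tiny made-up "gesture" image with the same shape shifted one pixel to the right. The two images and the function names `exact_match` and `differing_pixels` are invented for illustration.

```python
# Hypothetical 4x4 binary images: 1 = hand pixel, 0 = background.
expected = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# The same shape, shifted one pixel to the right.
captured = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]


def exact_match(a, b):
    """Return True only if every pixel in a equals the matching pixel in b."""
    return a == b


def differing_pixels(a, b):
    """Count the positions where the two images disagree."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


print(exact_match(expected, captured))       # False: the comparison fails
print(differing_pixels(expected, captured))  # 6 pixels differ after a 1-pixel shift
```

A person would call these two images the same gesture, but a strict pixel comparison rejects the second one outright.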
With many bumps and hurdles along the way, making this drone was certainly no easy task. At the very beginning, I planned to use a glove to separate the hand from the rest of the background, so that the drone could better identify the hand and classify the gesture. To train the drone, I first had to create a dataset. In this case, a dataset is a collection of photos used to train the machine learning model. The photos must be grouped into classifications, or categories, which can be done by separating the dataset into folders. Here the classifications were the different hand signals I held up, and each one contained a set of images for the action I wanted the drone to perform. Since I wanted 6 gestures, there were 6 classifications. Further complicating this was the need for train, test, and validation folders. The train folder would contain the images used to train the machine learning model, the test images would measure the accuracy of the model, and the validation images would be used for fine-tuning it. Once my dataset was organized into all of these folders, I was able to create a model.
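The folder layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gesture names, file names, and the 70/15/15 split are hypothetical, since the essay does not say how many images there were or how they were divided.

```python
import random


def split_dataset(files_by_class, train_pct=70, val_pct=15, seed=0):
    """Split each class's files into train/validation/test lists.

    The percentages are illustrative assumptions. Whatever is left after
    the train and validation shares goes to the test split.
    """
    rng = random.Random(seed)
    splits = {"train": {}, "validation": {}, "test": {}}
    for label, files in files_by_class.items():
        shuffled = sorted(files)
        rng.shuffle(shuffled)
        n_train = len(shuffled) * train_pct // 100
        n_val = len(shuffled) * val_pct // 100
        splits["train"][label] = shuffled[:n_train]
        splits["validation"][label] = shuffled[n_train:n_train + n_val]
        splits["test"][label] = shuffled[n_train + n_val:]
    return splits


def target_paths(splits):
    """Build relative paths of the form split/class/filename."""
    return [
        f"{split}/{label}/{name}"
        for split, classes in splits.items()
        for label, names in classes.items()
        for name in names
    ]


# Six hypothetical gesture classes with 20 placeholder file names each.
gestures = [f"gesture_{n}" for n in range(1, 7)]
files_by_class = {g: [f"{g}_{i:03d}.jpg" for i in range(20)] for g in gestures}

splits = split_dataset(files_by_class)
paths = target_paths(splits)
print(paths[0])  # a path of the form train/gesture_1/<file name>
```

Each image ends up in exactly one of the three splits, inside a folder named after its gesture, which is the structure many image-classification tools expect.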
Testing different neural networks was the best part of this project, as I could watch the accuracy improve with each model. What started at 33% accuracy soon became 45% and later 88%. By the time I stopped testing, the accuracy had settled at about 99%, and I decided to use that model for the drone. Through my testing I learned how much different variables affected accuracy, including whether the images were RGB or grayscale, the number of layers in the neural network, and the number of epochs.
Everything came crashing down when I realized that, for some strange reason, my model was not predicting the categories correctly when I tested it. The problem was that the resolution of the images in the dataset did not match the resolution of the drone's camera. The drone's camera compressed each image, making all of the images look alike. Setbacks like this are a common theme in artificial intelligence, as the work revolves around trial and error.
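One way to surface this kind of mismatch early is to send both the dataset photos and the live camera frames through the same preprocessing step and to check the input size before predicting. The sketch below is an illustration, not the fix used in this project, which moved to a different approach. It assumes images are plain nested lists of grayscale values, uses simple nearest-neighbor resizing, and the sizes and function names are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2D grayscale image (list of rows) by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess(image, size=(4, 4)):
    """Single preprocessing path shared by dataset creation and live frames."""
    out_h, out_w = size
    return resize_nearest(image, out_h, out_w)


def check_shape(image, size=(4, 4)):
    """Raise an error if an image does not have the expected input size."""
    shape = (len(image), len(image[0]))
    if shape != size:
        raise ValueError(f"expected input of size {size}, got {shape}")


# Hypothetical frames: an 8x8 dataset photo and a 6x12 drone camera frame.
dataset_photo = [[(r + c) % 256 for c in range(8)] for r in range(8)]
drone_frame = [[(r * c) % 256 for c in range(12)] for r in range(6)]

train_input = preprocess(dataset_photo)
live_input = preprocess(drone_frame)

# Both inputs now have the same size, so these checks pass silently.
check_shape(train_input)
check_shape(live_input)
```

Calling `check_shape(drone_frame)` on the raw frame raises a `ValueError`, which points at the mismatch directly instead of leaving it to show up as poor predictions.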
I decided to take a different path, one with the same foundation but a more cautious plan of attack. I had learned from my mistake and resolved to be more vigilant about potential errors. My new strategy was to train the model on coordinate points: instead of images, I used the coordinates of points on my hand. This path turned out to be simpler and much more efficient. After training the model and programming the drone to perform actions based on the model's predictions, I now have a working gesture-controlled drone. I am grateful for all of these errors and the hours spent revamping my project, as they have taught me a tremendous amount about artificial intelligence and machine learning.
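To give a feel for why coordinates are easier to learn from than raw pixels, here is a minimal sketch under stated assumptions. The essay does not name the hand-tracking tool or the model that was used, so this example assumes the hand arrives as a list of (x, y) points with the wrist first, and it uses a simple nearest-centroid classifier as a stand-in for the real model. The gesture labels and sample points are hypothetical.

```python
import math


def normalize(points):
    """Make hand coordinates independent of position and scale.

    points is a list of (x, y) tuples; the first point is assumed to be the wrist.
    """
    wx, wy = points[0]
    shifted = [(x - wx, y - wy) for x, y in points]
    # Divide by the largest distance from the wrist (or 1.0 if all points coincide).
    scale = max(math.hypot(x, y) for x, y in shifted) or 1.0
    return [v / scale for x, y in shifted for v in (x, y)]


def fit_centroids(samples):
    """Average the normalized feature vectors of each gesture label."""
    grouped = {}
    for label, points in samples:
        grouped.setdefault(label, []).append(normalize(points))
    return {
        label: [sum(col) / len(col) for col in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, points):
    """Return the label whose centroid is closest to the normalized input."""
    features = normalize(points)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical three-point "hands": wrist, then two fingertip positions.
samples = [
    ("open", [(0, 0), (-2, 4), (2, 4)]),
    ("open", [(1, 1), (-1, 5), (3, 5)]),
    ("fist", [(0, 0), (-1, 1), (1, 1)]),
    ("fist", [(5, 5), (4, 6), (6, 6)]),
]
centroids = fit_centroids(samples)

# The same "open" shape, moved elsewhere in the frame and doubled in size.
query = [(10, 10), (6, 18), (14, 18)]
label = predict(centroids, query)
print(label)  # open
```

Because the points are measured relative to the wrist and scaled by the size of the hand, the same gesture produces the same features wherever it appears in the frame. A handful of numbers per hand is also far less data than a full image, and it does not depend on the camera's resolution.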