Machine Learning Experiments

Placing a camera on the front of the mower and using machine learning techniques to determine if it 'sees' grass is a good project to extend my understanding of this area of technology. I'll start simple and work up to more complex architectures if necessary.

Update 20-10-2021

Now that Moana is equipped with a camera, a Raspberry Pi and a remote control interface, it has been possible to collect a set of training images suitable for use in a supervised learning setting. I have collected about 4000 images, which although not a massive number, is a good place to start from.
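For the record, the capture side can be as simple as a loop that grabs a frame every second or so while I drive the mower around. This is a minimal sketch with OpenCV, not my exact capture code: the camera index, output directory and frame interval are assumptions for illustration.

```python
# capture_sketch.py - grab frames from the Pi camera and save them
# as numbered JPEGs for later labelling. Camera index 0, the output
# directory and the 1-second interval are illustrative assumptions.
import os
import time
import cv2

OUT_DIR = "captures"
os.makedirs(OUT_DIR, exist_ok=True)

cap = cv2.VideoCapture(0)          # default camera device
frame_no = 0
try:
    while frame_no < 500:          # capture a batch of 500 frames
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(OUT_DIR, f"img_{frame_no:05d}.jpg"), frame)
        frame_no += 1
        time.sleep(1.0)            # roughly one frame per second while driving
finally:
    cap.release()
```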
Labelling Data

This was not as onerous as I thought it would be. I collected a set of images of grass, which go in one directory, and a set of non-grass images, which go in another. As part of the data-preprocessing step, I simply label each image according to the directory it is in. I wrote a few Python utility scripts to move stuff around and rename files, but generally it was all quite straightforward and did not require hours of looking at grass photos. Since the Pi is set up with WiFi, I pulled all the images back onto my laptop with scp when in the vicinity of the house. All the training and model selection will be done off the mower, and trained models pushed back to Moana when I want to test them. I believe I will be spending a while in laptop-land before we have anything useful.

Machine Learning Models

Simple Models

Before going Banzai and jumping into TensorFlow and deep learning, I decided to try some 'simple' statistical models to see how well they perform on my data set. I wrote a batch gradient descent program and confess to being somewhat disappointed with the results, only getting about 60% accuracy. I then dived into the sklearn library to try a few others. This library is well written, and the models follow the same interface, so they can easily be swapped in and out for comparison (a sketch of this comparison loop appears after the results table below).

Data Preprocessing

For these tests, the images are greyscaled and flattened to a one-dimensional vector for processing. So there is no real 'image recognition' going on here; we are looking for trends in the data using probabilistic/statistical models.

Model Measurement

I'm using two measures to compare models, based on false positives (FP) (I think it is grass, but it is not) and false negatives (FN) (I think it isn't grass, but it is). For a linear model, it is possible to trade one off against the other by moving a decision threshold. In practice, this means: if the false positive rate is too high, I can end up mowing things I should not. Moana could go through a flowerbed, drive off down the footpath or mow a nearby cat. If the false negative rate is too high, the mower will stop mowing grass when it should be mowing it. At best, this leaves patches of long grass and at worst it just stops mowing. So false positives at worst are dangerous, while false negatives lead to an unreliable mower. If possible, we want to minimize both, with a bias towards a low false positive rate.

Precision and Recall

Precision is the ratio TP/(TP + FP): the fraction of 'it is grass' predictions that really are grass, so low precision means a high false positive rate. Recall is the ratio TP/(TP + FN): the fraction of actual grass the model catches, so low recall means a high false negative rate. Here are results from a set of models trained on my 4000-image dataset, more or less out of the box. Adjusting hyperparameters may improve the performance of particular models, but for now this is a case of seeing where we stand and getting a ballpark figure.
Model                    Precision  Recall
LogisticRegression       0.617      0.557
SGDClassifier            0.651      0.513
Perceptron               0.653      0.362
RidgeClassifier          0.572      0.531
KMeans                   0.484      0.483
KNeighborsClassifier     0.791      0.041
DecisionTreeClassifier   0.703      0.694
GaussianNB               0.699      0.857
BaggingClassifier        0.790      0.743
RandomForestClassifier   0.801      0.834
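The numbers above come from a comparison script along these lines. This is a minimal sketch rather than my exact code: the directory names, working image size and train/test split are assumptions for illustration, but it shows how the common sklearn interface lets the models be swapped in and out.

```python
# compare_models.py - label images by directory, greyscale and flatten
# them, then run a set of sklearn classifiers through the same
# fit/predict interface and report precision and recall for each.
import glob
import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

IMG_SIZE = (64, 64)  # assumed working resolution

def load_dir(pattern, label):
    """Load every image matching pattern as a flattened greyscale vector."""
    X, y = [], []
    for path in glob.glob(pattern):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if img is None:
            continue                      # skip unreadable files
        img = cv2.resize(img, IMG_SIZE)
        X.append(img.flatten())           # 1-D vector, no spatial structure kept
        y.append(label)
    return X, y

# The label simply comes from which directory the image lives in.
Xg, yg = load_dir("grass/*.jpg", 1)
Xn, yn = load_dir("not_grass/*.jpg", 0)
X = np.array(Xg + Xn, dtype=np.float32) / 255.0
y = np.array(yg + yn)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

models = [LogisticRegression(max_iter=1000), SGDClassifier(),
          DecisionTreeClassifier(), GaussianNB(), RandomForestClassifier()]

for model in models:
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(type(model).__name__,
          {"precision": precision_score(y_test, pred),
           "recall": recall_score(y_test, pred)})
```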
Simple Model Results

LogisticRegression (vanilla batch gradient descent) and SGDClassifier gave precision in the range of 60-65%, which is not great, and recall of only around 50%, which is no better than guessing. Disappointing results, I must say. The single Perceptron gave even poorer results for recall, and the RidgeClassifier's regularization is not helping here. The DecisionTreeClassifier is showing more encouraging results, but it looks like GaussianNB and the ensemble methods (BaggingClassifier and RandomForestClassifier) give the best results. These are achieving over 80% recall and higher precision as well. Because the ensemble methods average over many estimators, they may just be eliminating noise in my data. However, I feel this is a good place to start. The RandomForestClassifier can serve as a candidate for the first grass detector and will enable me to write the framework to use it. I am aiming for in excess of 95% for both precision and recall, but 80% seems pretty good for a simple algorithm.

Update 06-11-2021

I reworked some of the Arduino pins I'm using so that I can tidy up some of the wiring. The Raspberry Pi is going to require two inputs and one output, and each of the 3 sonars requires two digital pins, giving a total of 9 pins, so I'm going to use 10 adjacent digital pins from the output towards the front of the mower. I will eventually house all the sonars and the Pi in a single 'Raspberry Barrel', but I need to ensure everything still works for now.

Software

I now have a spare Raspberry Pi (3B+) which I have loaded up with all the necessary libraries. This is a lot easier with Buster as there is a GUI package manager. The main libraries I need are numpy, matplotlib, sklearn and opencv. Remember to load the Python bindings for opencv! I tidied up the code so we now have predict, capture and fit projects, and moved some code into a common library (persist and preprocess) so I can use the same code in fit and predict (a sketch of the persist helper follows below). I found a problem with the library versions on the Mac being ahead of those on the Pi, which could cause issues, so I decided to try training on the Pi to overcome this. It takes 13 minutes to train, as opposed to less than a minute on the Mac.
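The Mac/Pi version skew is exactly the kind of thing the persist helper runs into, since pickled sklearn models are sensitive to library versions. As a sketch, the helper could be little more than a thin wrapper around joblib; the function names and default path here are illustrative, not my exact API.

```python
# persist.py - thin wrapper around joblib so fit and predict save and
# load models the same way. Pickled sklearn models are sensitive to
# library versions, hence training on the Pi itself.
import joblib

def save_model(model, path="grass_model.joblib"):
    joblib.dump(model, path)

def load_model(path="grass_model.joblib"):
    return joblib.load(path)
```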
Moana - Testing the Random Forest Classifier

Update 21-11-2021

Today was the first practical ML test. The trained Random Forest Classifier model was loaded onto Moana's Pi and integrated with the Arduino, the radio control and the remote logging to the laptop via the XBees. First I tried running on concrete and the deck. The verdict was 'this is not grass', and it worked pretty reliably. Next I tried running Moana on grass. This was rather disappointing. I was not seeing 80% accuracy; the rate of false negatives was very high. I tried different types of grass and it was more accurate on some than others, but in all cases I was nowhere near 80%. I checked the Pi/Arduino interface and everything was working as expected. Length and color of grass had an impact on the results, but the results indicated that the computer modelling I had done did not hold water out in the wild. I could capture more images to train with, but I am convinced a single-node model is unable to solve the problem. It is time to investigate artificial neural networks and multi-layer perceptron architectures. I will use Keras and TensorFlow to implement this, so I need to look into getting this technology to run on the Pi (a first sketch of the planned network appears below). Training times will become much longer, but I'll worry about that problem only if it becomes excessive. Although my initial approach has proved disappointing, I have learned a great deal on this journey and feel justified in moving to a full neural network now. I will take the opportunity to rewire and restructure the sensor tower and replace it with the Raspberry Barrel. Quite a few hardware and software changes are required going forward.
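As a first step in that direction, a multi-layer perceptron on the flattened greyscale vectors might look something like the following in Keras. This is a sketch of the planned direction, not tested code from the mower; the layer sizes and the 64x64 input resolution are assumptions.

```python
# mlp_sketch.py - a simple multi-layer perceptron for binary
# grass / not-grass classification on flattened greyscale images.
# Layer sizes and the 64x64 input resolution are assumptions.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(64 * 64,)),        # flattened greyscale image
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid")  # P(grass)
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[keras.metrics.Precision(), keras.metrics.Recall()])

# With X_train, y_train as produced by the earlier preprocessing step:
# model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.2)
```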
November 2021