
Artificial Neural Networks

After experimenting with several statistical models in scikit-learn, I came to the conclusion that for grass detection they were not going to cut it and I needed to move on to Artificial Neural Networks. I have experimented with TensorFlow and cv2 on my laptop, so I need to apply some of that learning to the moana project.

Making models compatible between my laptop and the Pi turns out to be quite a headache, so I decided to try (initially at least) to build everything on the Pi. This means training as well, which may not be too painful for simpler models. If this becomes too onerous, I will try training on the laptop or on Google Colab and tackle the transfer later. Keep it simple initially!

Configuring the Pi

Getting compatible versions of cv2 and TensorFlow running on a Raspberry Pi 3B turned out to be a house of pain. After an awful lot of effort, I managed to get them working on maui under a virtual environment.

Activating the virtual environment

This is to help me remember where I put things. I will do some refactoring later to tidy up locations and my git repo.

pi@maui:~ $ source ~/Documents/grass/grassenv/bin/activate
(grassenv) pi@maui:~ $ 

Versions

>>> import cv2
>>> cv2.__version__
'4.4.0'
>>> import tensorflow
>>> tensorflow.__version__
'2.5.0'

Pre-Processing Data

I have collected about 5000 grass and not-grass images in two directories as 64x64 pixel images. I decided to further reduce these to 32x32 images stored in two further directories.

pi@maui:~/git/projects/python/moana_grass/grass_capture/util/resize_images.py
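
The script itself isn't listed here, but the gist of the resize step is along these lines. The directory names below are placeholders, not the actual paths the script uses.

  import os
  import cv2

  SRC_DIR = 'grass_64'   # placeholder: directory of 64x64 captures
  DST_DIR = 'grass_32'   # placeholder: directory for the 32x32 copies

  os.makedirs(DST_DIR, exist_ok=True)

  for name in os.listdir(SRC_DIR):
      img = cv2.imread(os.path.join(SRC_DIR, name))
      if img is None:
          continue   # skip anything that isn't a readable image
      small = cv2.resize(img, (32, 32), interpolation=cv2.INTER_AREA)
      cv2.imwrite(os.path.join(DST_DIR, name), small)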

Labelling Data

Labelling was discussed previously. It is handled in the pre-processing library I have written.

pi@maui:~/git/projects/python/moana_grass/grass_capture/pi/common_lib/preprocess.py
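
The labelling itself boils down to something like the sketch below; the function name and directory arguments are illustrative rather than copied from preprocess.py.

  import os
  import cv2
  import numpy as np

  def load_labelled(grass_dir, not_grass_dir):
      # Every image under grass_dir gets label 1, every image under
      # not_grass_dir gets label 0.
      images, labels = [], []
      for directory, label in [(grass_dir, 1), (not_grass_dir, 0)]:
          for name in os.listdir(directory):
              img = cv2.imread(os.path.join(directory, name))
              if img is not None:
                  images.append(img)
                  labels.append(label)
      return np.array(images), np.array(labels)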

Machine Learning Models

I'm going to experiment with several TensorFlow ML models. All of these will be trained on my image dataset, I'll measure their performance, and I may need to collect more images if there is insufficient data to train the models accurately.

All of these models are going to require some data pre-processing prior to training:

  • Convert 64x64 colour images to 32x32 colour images.
  • Convert 32x32 colour images to 32x32 greyscale images.
  • Label the data (grass=1, not-grass=0).
  • Read the image/label data into numpy arrays for processing.
  • Split into training and validation datasets.
  • Pass numpy arrays into various models for training.

Note that similar pre-processing steps will need to be undertaken in prediction mode, based on live camera data.
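
As a rough sketch (the function name and split logic below are my own shorthand rather than anything lifted from the library), the greyscale conversion, scaling and 90/10 split look something like this. The same greyscale conversion and scaling will be applied to each live camera frame at prediction time.

  import cv2
  import numpy as np

  def prepare(images, labels, val_fraction=0.1, seed=42):
      # Convert BGR colour images to greyscale and scale pixels to [0, 1]
      grey = np.array([cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) for img in images])
      grey = grey.astype('float32') / 255.0

      # Shuffle, then hold back the last 10% for validation
      rng = np.random.default_rng(seed)
      order = rng.permutation(len(grey))
      split = int(len(grey) * (1 - val_fraction))
      train_idx, val_idx = order[:split], order[split:]
      return (grey[train_idx], labels[train_idx]), (grey[val_idx], labels[val_idx])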

Fully connected models (tf_model1)

Since the output is binary (grass or not-grass), I will use the binary cross-entropy loss function in all the models and flatten the input in a similar manner to that used for the scikit-learn models. I experimented with three fully connected models: a single-layer, a two-layer and a three-layer variant.

Single layer Model definition

  from tensorflow import keras

  def create_model_1024():
      model = keras.models.Sequential()
      # Flatten the 32x32 greyscale image into a 1024-element input vector
      model.add(keras.layers.Flatten(input_shape=[32, 32]))
      # One fully connected hidden layer of 1024 nodes
      model.add(keras.layers.Dense(1024, activation='relu'))
      # Single sigmoid output: grass (1) or not-grass (0)
      model.add(keras.layers.Dense(1, activation='sigmoid'))
      return model

Two layer Model definition

  def create_model_512_300():
      model = keras.models.Sequential()
      model.add(keras.layers.Flatten(input_shape=[32, 32]))
      model.add(keras.layers.Dense(512, activation='relu'))
      model.add(keras.layers.Dense(300, activation='relu'))
      model.add(keras.layers.Dense(1, activation='sigmoid'))
      return model

Three layer Model definition

  def create_model_200_300_200():
      model = keras.models.Sequential()
      model.add(keras.layers.Flatten(input_shape=[32, 32]))
      model.add(keras.layers.Dense(200, activation='relu'))
      model.add(keras.layers.Dense(300, activation='relu'))
      model.add(keras.layers.Dense(200, activation='relu'))
      model.add(keras.layers.Dense(1, activation='sigmoid'))
      return model

I experimented with various optimizers and settled on Adam with a learning rate of 0.00001, training for 1000 epochs. I used 10% of the image dataset for validation.
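
For reference, compiling and training one of these models looks roughly like the sketch below. The x_train/y_train and x_val/y_val arrays are assumed to come from the pre-processing and 90/10 split described earlier, not from any particular script of mine.

  model = create_model_1024()
  model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.00001),
                loss='binary_crossentropy',   # binary output: grass / not-grass
                metrics=['accuracy'])

  history = model.fit(x_train, y_train,
                      epochs=1000,
                      validation_data=(x_val, y_val))   # the 10% hold-out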

Training Accuracy

Below is a table summarizing the computed training accuracy. All models converged towards a training accuracy of 1, which implies I'm overfitting the data, but generally all three models end up at around 80% validation accuracy. The three-layer model trained faster than the other two.

  model               training accuracy   validation accuracy
  model_1024          0.99                0.80
  model_512_300       1.00                0.79
  model_200_300_200   1.00                0.81

From these results, I feel 80% is likely to be the best I can get from these simple fully connected models with the dataset I have. I am going to have a stab at augmenting the dataset to double its size and will then re-run these tests.
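
The simplest option is probably to add a horizontally flipped copy of every image, along the lines of the sketch below. This is just the idea, not necessarily the augmentation I will settle on.

  import numpy as np

  def augment_flip(images, labels):
      # Mirror each 32x32 greyscale image left-to-right and append it to the
      # dataset, duplicating the labels so the array lengths still match.
      flipped = images[:, :, ::-1]
      return (np.concatenate([images, flipped]),
              np.concatenate([labels, labels]))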


October 2023

