HOBO Data, Student Version

In [1]:

import pandas as pd
import matplotlib.pyplot as plt
import scipy
import sklearn as sk
import cv2
import numpy as np

We'll have three groups working on different parts of the HOBO data (described below).

Group 1 will extract sensor coordinates from a satellite image screenshot. You will produce a function that returns screen coordinates for x,y pairs, as in:
- pos("A","30") = 143,299
Group 2 will ingest and clean all the relevant data and store it in arrays indexed by <x,y,t> (horizontal, vertical, time), so that a user can query
- "A, 30, 01/27/2022 17:28:08" and get back the reading -0.95 34.40
Group 3 will explore 3D plots in matplotlib, specifically contour and surface plots, and then animate the plots as parameters change. You can start with variants of $y = e^{-x^2} \sin(t)$ and $z = 4 \sin(t) e^{-x^2-y^2} \cos\left(4 x^2+4 y^2\right)$.

Part 1: Extracting Coordinates

There's some data being collected here. Read about the HOBO data first. Then identify the data for the "red" grid of sensors for ACL. In Part 1 you will approximate the coordinates of these sensors by analyzing a screenshot. You're using the OpenCV library (cv2), which should be installed in this environment.

In [2]:

pic = cv2.imread("red_sensor_image.png", cv2.IMREAD_COLOR)

Find the shape of the image, then select a square submatrix that doesn't cut off any data.

In [3]:

pic.shape

Out[3]:

(1180, 1235, 3)

In [4]:

pic = pic[:,0:1180,:]
pic.shape

Out[4]:

(1180, 1180, 3)

Convert the image from BGR to RGB (OpenCV stores the color channels in a different order than matplotlib expects) and then display it. Use the commands cv2.cvtColor and plt.imshow().

To find the red dots, we choose to analyse just the red layer of pixel information. Split the image into RGB channels using cv2.split and display the red channel only, as grayscale.

You see the red pixels are nearly white and the background is nearly black. We want to crush this image into 0s and 255s for black and white, so the red dots really pop. Use NumPy boolean-mask indexing (much like pandas filtering). Any pixel with a red value above 200 should become 255. Everything else should become 0. (Be sure to copy your red matrix first in case you mess up.)
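As a minimal sketch of that thresholding idea, here is boolean-mask indexing on a small made-up array. The values in `red` below are invented for the demonstration; `red2` is the working copy, matching the name used later in this notebook.

```python
import numpy as np

# Made-up red-channel values standing in for the real red matrix.
red = np.array([[10, 250, 199],
                [201, 200, 0]], dtype=np.uint8)

# Work on a copy so the original channel survives any mistakes.
red2 = red.copy()

# Compute the mask once, before modifying anything.
bright = red2 > 200

red2[bright] = 255   # strictly above 200 becomes white
red2[~bright] = 0    # 200 and below becomes black
```

Computing the mask once up front matters: if you assign 255 first and then test the modified array again, you are no longer comparing against the original values.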

In [ ]:

## print the red matrix here as a sanity check
# and its size

In [ ]:

Plot your black and white matrix here

In [ ]:

Machine Learning -- KMeans

We need to know the location of the centers of these dots, in the coordinate space of the picture. The KMeans algorithm is well suited to this: it groups points into clusters and reports the center of each cluster. I'll get you started with a list of the coordinates of all the white pixels. The KMeans algorithm will determine all the centers and list them for you as coordinate pairs. You have to tell it how many centers you're looking for. Then create a plot with (a) the original image and (b) the centers plotted as white 'x' symbols (look at plt.scatter(marker = ...)).

In [20]:

from sklearn.cluster import KMeans

# Extract the (row, column) coordinates of the white pixels (where red2 == 255)
white_pixels = np.column_stack(np.where(red2 == 255))
white_pixels

Out[20]:

array([[  30,  828],
       [  30,  829],
       [  30,  830],
       ...,
       [1162,  192],
       [1162,  193],
       [1162,  194]])

In [ ]:

# Apply KMeans to find the centers of the white dots
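Before running KMeans on the real pixel list, it can help to check the workflow on a toy image where the answer is known. The sketch below is an illustration under assumptions: the 50x50 image `demo` and its three dot positions are invented, and `n_init` and `random_state` are just reasonable choices, not required settings.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Invented test image: black background with three 5x5 white dots
# whose centers, as (row, col), are known in advance.
demo = np.zeros((50, 50), dtype=np.uint8)
true_centers = [(10, 10), (10, 40), (40, 25)]
for r, c in true_centers:
    demo[r - 2:r + 3, c - 2:c + 3] = 255

# One (row, col) pair per white pixel, same layout as white_pixels.
pixels = np.column_stack(np.where(demo == 255))

# Ask for exactly as many clusters as there are dots.
km = KMeans(n_clusters=len(true_centers), n_init=10, random_state=0).fit(pixels)
centers = km.cluster_centers_  # float (row, col) pairs

# scatter expects x (column) first, then y (row), so swap the order.
# Red markers are used here only because the demo dots are white.
fig, ax = plt.subplots()
ax.imshow(demo, cmap="gray")
ax.scatter(centers[:, 1], centers[:, 0], marker="x", color="red")
```

Note the coordinate swap: np.where returns rows before columns, while plt.scatter takes x (the column) before y (the row). Forgetting this reflects the markers across the diagonal.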

Part 2: Ingesting Data

Use pandas to read the CSV files into data frames. You will probably want to merge the frames into one big one. Then think about how to filter or aggregate the data so that when someone asks for a location and time, you can return the matching reading right away. (Assume their input is valid at first, but then handle the case when it isn't.)
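One possible shape for this, sketched with in-memory frames instead of real files: everything here is an assumption for illustration, including the column names (`col`, `row`, `time`, `val1`, `val2`), the function name `lookup`, and every data row except the A/30 reading quoted at the top of this notebook. With real data you would build the frames with pd.read_csv.

```python
import pandas as pd

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"

# Stand-ins for two per-sensor CSV files. Only the first A/30 row
# comes from the notebook text; the other rows are made up.
frame_a30 = pd.DataFrame({
    "col": ["A", "A"],
    "row": [30, 30],
    "time": ["01/27/2022 17:28:08", "01/27/2022 17:38:08"],
    "val1": [-0.95, -1.02],
    "val2": [34.40, 34.90],
})
frame_b30 = pd.DataFrame({
    "col": ["B"],
    "row": [30],
    "time": ["01/27/2022 17:28:08"],
    "val1": [-0.80],
    "val2": [35.10],
})

# Stack the frames, parse timestamps, and index by (col, row, time).
readings = pd.concat([frame_a30, frame_b30], ignore_index=True)
readings["time"] = pd.to_datetime(readings["time"], format=TIME_FORMAT)
readings = readings.set_index(["col", "row", "time"]).sort_index()


def lookup(col, row, time):
    """Return (val1, val2) for one sensor and timestamp, or None if
    the input is malformed or no such reading exists."""
    try:
        key = (col, int(row), pd.to_datetime(time, format=TIME_FORMAT))
        record = readings.loc[key, ["val1", "val2"]]
    except (KeyError, ValueError, TypeError):
        return None
    return float(record["val1"]), float(record["val2"])
```

For example, `lookup("A", "30", "01/27/2022 17:28:08")` returns the pair `(-0.95, 34.4)`, while an unknown sensor or an unparseable timestamp returns `None`. Returning `None` is just one design choice; raising a descriptive error is equally defensible.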

Part 3: Visualization

Read up on matplotlib 3D plotting. There are four different things here:

- 3D plots and contour plots (related)
- animation
- animation with 3D plots
- saving the animation as a file

There are plenty of tutorials on saving a simple 2D animation. I'd do that first. Then play around with 3D contour and surface plots. Finally, try to get an exported video of a 3D animation. (You might need ffmpeg; it's easy to install on the Unix side of WSL.)
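As a starting point for the 2D case, here is a minimal sketch that animates $y = e^{-x^2} \sin(t)$ with matplotlib's FuncAnimation. The grid size, frame count, and output file names are arbitrary choices, and the save calls are left commented out because they depend on what writers are installed on your machine.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

x = np.linspace(-3, 3, 201)

fig, ax = plt.subplots()
(line,) = ax.plot(x, np.zeros_like(x))
ax.set_ylim(-1.1, 1.1)  # fix the axis so the curve moves, not the scale


def update(t):
    # Redraw the curve for the current value of the parameter t.
    line.set_ydata(np.exp(-x**2) * np.sin(t))
    return (line,)


times = np.linspace(0, 2 * np.pi, 60)
anim = FuncAnimation(fig, update, frames=times, interval=50, blit=True)

# Saving is optional and depends on installed writers:
# anim.save("wave.gif", writer="pillow")   # needs the Pillow package
# anim.save("wave.mp4", writer="ffmpeg")   # needs ffmpeg on the PATH
```

Keep a reference to `anim` for as long as you want the animation to run; if it is garbage-collected, the animation stops.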

The goal here is to be able to take grids of data (x,y,t,z) corresponding to location and time and an output like temperature, and plot that as a 3d surface and/or contour and then animate it for a range of times.
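A sketch of that end goal, using the formula $z = 4 \sin(t) e^{-x^2-y^2} \cos(4x^2+4y^2)$ in place of real sensor data. The grid, the colormap, and the redraw-every-frame approach are assumptions chosen for simplicity; with real data, `surface(t)` would instead return the grid of readings at time `t`.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Regular x,y grid; a real data set would use the sensor positions.
x = np.linspace(-2, 2, 60)
y = np.linspace(-2, 2, 60)
X, Y = np.meshgrid(x, y)


def surface(t):
    """Height of the example surface at time t."""
    r2 = X**2 + Y**2
    return 4 * np.sin(t) * np.exp(-r2) * np.cos(4 * r2)


fig = plt.figure()
ax = fig.add_subplot(projection="3d")


def update(t):
    # Simplest approach: clear the axes and redraw the whole surface.
    ax.clear()
    ax.set_zlim(-4, 4)  # keep the vertical scale fixed between frames
    ax.plot_surface(X, Y, surface(t), cmap="viridis")
    ax.set_title(f"t = {t:.2f}")


times = np.linspace(0, 2 * np.pi, 40)
anim = FuncAnimation(fig, update, frames=times, interval=100)

# Exporting a video requires ffmpeg to be installed:
# anim.save("surface.mp4", writer="ffmpeg")
```

Clearing and redrawing each frame is not the fastest option, but it is the easiest to get right. Swapping `plot_surface` for `ax.contour(X, Y, surface(t))` gives the contour version of the same animation.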