Wow OK I just checked that file and it looks great. Please review and see if you get even more ideas.
Today’s goal: by the end of class, add YOUR NAME to the same file, along with three specific project ideas:
They should be specific enough that I can understand them, but don’t overspecify too early.
They should have some clear path forward – things that look like they can be done.
It’s OK if they have parts that you can’t see how to do yet.
They should each be a research idea, or an invention, or a product (hardware, software, etc).
Finally, if you had to pick a Sr research direction (including but not limited to: AI/ML, App Dev, game dev, software dev, compilers, OS, computer vision, networking, cybersecurity, really ANYTHING), what would that area be? This is not a commitment, just a survey. Add that below your name.
2/5/2025 (Wednesday)
Research: Take one of your research ideas, and do some background research. Find 3 peer-reviewed papers or articles that relate in some way to a piece of the puzzle. It’s fine if the articles are for different ideas – you don’t have to overspecify at this point. For each article, read it carefully and be able to answer: “What relevant question about your project does this article answer?” and “What new question does it raise?”
Create a Google Doc (for now, even though I hate Google Docs) and cite your 3 articles, link directly to them, and answer the two questions. Do this in class today please. Share the Google Doc with me.
Hardware: If you haven’t yet, finish setting up tf-gpu on a laptop you can use. Please crowdsource this until it works.
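One quick way to tell whether the tf-gpu setup actually worked is to ask TensorFlow which devices it can see. This is only a minimal sketch: the helper name `gpu_summary` is made up for illustration, and an empty `gpu_names` list usually means TensorFlow is falling back to the CPU.

```python
import tensorflow as tf


def gpu_summary():
    """Collect basic facts about the TensorFlow install (hypothetical helper)."""
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tf_version": tf.__version__,
        # True only if this TensorFlow build was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Empty list means TensorFlow cannot see a GPU.
        "gpu_names": [gpu.name for gpu in gpus],
    }


print(gpu_summary())
```

If the list comes back empty, share the printed dictionary when you ask for help so others can see your version info.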
Cloud: You should probably get a cloud computing account to use if you need some heavy duty computing. You should try any/all of the following and see what you think: Google Colab (free but slow), AWS SageMaker Studio Lab, Lightning.ai, Paperspace, Lambda Labs, something else? IDK if these are blocked by the firewall or what exactly you have to do to sign up, but they all have some kind of free tier.
I’m hoping to get us some cloud computing accounts somewhere but for now no clue if that will happen.
Bonus activity for AP Stats: If you have a few minutes please complete this form:
Task: Following the example code here, implement transfer learning on an InceptionV3 network to recognize images from the flower dataset (or any other TF dataset or batch of labeled images). Use Colab GPU for this (or a similar service) if you don’t have a personal GPU.
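For orientation, here is a rough sketch of what the transfer learning setup might look like in Keras. It is not the linked example code. The function name `build_transfer_model`, the dropout rate, and the default of 299x299 inputs are my own assumptions, and the five classes in the usage note assume the five-class flowers dataset. The idea is to freeze the pretrained InceptionV3 base and train only a small new classification head.

```python
import tensorflow as tf


def build_transfer_model(num_classes, weights="imagenet", image_size=299):
    """Frozen InceptionV3 base plus a new softmax head (illustrative sketch)."""
    base = tf.keras.applications.InceptionV3(
        include_top=False,  # drop the original 1000-class ImageNet head
        weights=weights,
        input_shape=(image_size, image_size, 3),
    )
    base.trainable = False  # keep the pretrained features fixed

    inputs = tf.keras.Input(shape=(image_size, image_size, 3))
    # InceptionV3 expects pixels in [-1, 1]; this assumes raw inputs in [0, 255].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)  # keep batch norm layers in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model
```

With the flowers data you would call something like `build_transfer_model(num_classes=5)` and then `model.fit(...)` on batches of resized images. Once the head has converged, unfreezing some of the top layers of the base with a low learning rate is a common next step.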
Unmoderated! My AlexNet and GoogLeNet notebooks (for reference; they have not been cleaned up).
Homework: Train a deep RNN model for as long as possible and try to maximize validation accuracy. Be sure to use checkpoint saving callbacks. Use Shakespeare or any other dataset you’d like to imitate.
Here’s a sample of my network running on Colab. It’s a deep RNN network with dropout and an encoding layer. Sample output is at the end. This was trained on an A100.
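For the checkpoint saving callbacks mentioned in the homework, a minimal sketch might look like the following. The file path and the helper name `make_checkpoint_callback` are placeholders, and monitoring `val_accuracy` assumes the model was compiled with `metrics=["accuracy"]` and that you pass validation data to `fit`.

```python
import tensorflow as tf


def make_checkpoint_callback(path="best_rnn.keras"):
    """Save the model whenever validation accuracy improves (illustrative)."""
    return tf.keras.callbacks.ModelCheckpoint(
        filepath=path,           # placeholder; on Colab a Drive path survives disconnects
        monitor="val_accuracy",  # needs metrics=["accuracy"] and validation data
        mode="max",              # higher accuracy is better
        save_best_only=True,     # keep only the best epoch seen so far
    )
```

You would pass it as `model.fit(..., callbacks=[make_checkpoint_callback()])`. If a long run gets cut off, `tf.keras.models.load_model(path)` should bring back the best saved model so you can keep training from there.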
3/12/2025 (Wednesday)
Finish Shakespeare from last class, share some results
Google Drive mapping
Write a Bach chorale! Get files here. Train your model and then have it output a CSV file in the same format as the input.
Try this gist to play your files in Colab (or download and play them locally).
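Writing the generated notes back out as CSV could be sketched like this. I am assuming the input files have a header row `note0,note1,note2,note3` followed by one row of four integer MIDI notes per time step. Check one of the actual files and adjust the `header` argument if yours differ. The function names are made up for illustration.

```python
import csv
import io


def chorale_to_csv(rows, header=("note0", "note1", "note2", "note3")):
    """Format rows of integer notes as CSV text (header layout is an assumption)."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Cast to int so numpy integers and floats like 60.0 print as plain 60.
        writer.writerow([int(note) for note in row])
    return buffer.getvalue()


def save_chorale(rows, path):
    """Write the CSV text to a file on disk."""
    with open(path, "w", newline="") as f:
        f.write(chorale_to_csv(rows))
```

For example, `save_chorale(generated, "my_chorale.csv")` where `generated` is a list (or array) of four-note rows.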
3/14/2025 (Friday)
Finish Shakespeare from last class if you haven’t already
Work on Bach
Start with one voice only (e.g. soprano, voice 0) and drop the others
Use an Embedding layer at the start of your RNN
MIDI note ranges are approx 40-70 in this dataset. You might want to shift that down before you train (and back up when you test).
Categorical is probably the best way to classify this data, but you could try a regression-based loss if you want.
Use the converter above to create MIDI/WAV files from your output.
Be careful with array and tensor dimensions. It’s annoying. Sequence input should look like [[10,12,10,15]] not [10,12,10,15]
When you’re ready for four voices, I recommend weaving the data into a 1D stream like this: SATBSATBSATB, where each group of 4 notes is played simultaneously. You can try keeping it as a 4-vector in a 3D tensor but that gets complicated.
Here’s sample output from a 3-layer deep GRU I trained. The training data is the first 4 measures or so and then the model takes over.
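The data-wrangling tips above (shifting the notes down, adding a batch dimension, and weaving four voices into one stream) can be sketched in NumPy like this. The function names are my own, and I am assuming each chorale is an array of shape (time steps, 4) with columns in S, A, T, B order. The shift uses the smallest value in the data as the offset, so the lowest note becomes 0.

```python
import numpy as np


def weave_voices(chorale):
    """(T, 4) array -> 1D stream S A T B S A T B ... (row-major flatten)."""
    return np.asarray(chorale).reshape(-1)


def unweave_voices(stream, num_voices=4):
    """Inverse of weave_voices: 1D stream -> (T, num_voices) array."""
    return np.asarray(stream).reshape(-1, num_voices)


def shift_down(notes):
    """Shift notes so the lowest value is 0; return the offset to undo it."""
    notes = np.asarray(notes)
    offset = int(notes.min())
    return notes - offset, offset


def shift_up(notes, offset):
    """Undo shift_down before converting model output back to MIDI."""
    return np.asarray(notes) + offset


def add_batch_dim(sequence):
    """[10, 12, 10, 15] -> [[10, 12, 10, 15]], i.e. shape (1, T)."""
    return np.asarray(sequence)[np.newaxis, :]
```

A typical round trip would be `stream = weave_voices(chorale)`, then `shifted, offset = shift_down(stream)` before training, and `unweave_voices(shift_up(predicted, offset))` on the way back out. Use one offset for the whole dataset, not one per chorale, so every file maps to the same classes.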
New topic: Sentiment Analysis. This is a text-to-categorical model. Work through the notebook here.
Assignment: Classify another similar dataset of text data. I recommend the Amazon Review dataset, but you can find your own if you prefer.
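If you want a starting skeleton for the assignment, here is one possible shape for a text-to-categorical model. This is not the notebook’s code: the helper names, vocabulary size, sequence length, and layer sizes are all assumptions to tune. The text is turned into integer token IDs outside the model, then fed through an Embedding layer and a GRU.

```python
import tensorflow as tf


def build_vectorizer(texts, max_tokens=10000, sequence_length=200):
    """Learn a vocabulary from the training texts and map text to integer IDs."""
    vectorizer = tf.keras.layers.TextVectorization(
        max_tokens=max_tokens,
        output_sequence_length=sequence_length,  # pad or truncate to fixed length
    )
    vectorizer.adapt(tf.constant(texts))
    return vectorizer


def build_sentiment_model(vocab_size=10000, num_classes=2):
    """Embedding -> GRU -> softmax over sentiment classes (illustrative sizes)."""
    model = tf.keras.Sequential([
        # mask_zero=True tells the GRU to skip the zero padding.
        tf.keras.layers.Embedding(vocab_size, 64, mask_zero=True),
        tf.keras.layers.GRU(64),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer labels
        metrics=["accuracy"],
    )
    return model
```

Training would then look roughly like `model.fit(vectorizer(tf.constant(train_texts)), train_labels, ...)`. Keep `vocab_size` equal to the vectorizer’s `max_tokens` so every token ID has a row in the embedding table.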
For next class: Pick your best Shakespeare and Bach examples to share!
Errata: The original Shakespeare_Student notebook had a critical error in the generation cell. (The later example notebook did not.) Please see the fix here.
Fun: Here are some of my generated Bach chorales. Each one starts with 64 samples (about 4 measures or 7 seconds) of seed from an unseen Bach chorale.
Side quest: Check out my app here for turning in Java code. Log in with your credentials for class and submit to the Binary Tree assignment. If you need a file to submit, here’s one that should work.
Assignment: Create an autoencoder for the Fashion MNIST dataset. If you get that working, add noise or dropout to the input images and see how well it reconstructs the originals.
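A bare-bones version of this assignment might look like the sketch below. The layer sizes, the latent dimension, and the Gaussian noise level are all assumptions to experiment with, and the names `build_autoencoder` and `add_noise` are made up for illustration. It assumes images are scaled to [0, 1].

```python
import numpy as np
import tensorflow as tf


def build_autoencoder(latent_dim=32):
    """Dense autoencoder for 28x28 grayscale images (illustrative sizes)."""
    inputs = tf.keras.Input(shape=(28, 28))
    x = tf.keras.layers.Flatten()(inputs)
    x = tf.keras.layers.Dense(128, activation="relu")(x)
    latent = tf.keras.layers.Dense(latent_dim, activation="relu")(x)  # bottleneck
    x = tf.keras.layers.Dense(128, activation="relu")(latent)
    x = tf.keras.layers.Dense(28 * 28, activation="sigmoid")(x)  # pixels in [0, 1]
    outputs = tf.keras.layers.Reshape((28, 28))(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model


def add_noise(images, stddev=0.2, seed=0):
    """Add Gaussian noise and clip back into [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(0.0, stddev, size=images.shape)
    return np.clip(noisy, 0.0, 1.0).astype("float32")
```

The data can be loaded with `tf.keras.datasets.fashion_mnist.load_data()` and divided by 255.0. For the plain autoencoder, fit with the images as both input and target. For the denoising version, fit with `add_noise(x_train)` as input and the clean `x_train` as target.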
4/10/2025 (Thursday)
Notes on genetic algorithms and symbolic regression