Learnings & Musings on AI, ML, Data Science & Python


Facebook has open sourced its PyTorch-based natural language processing modeling framework. According to Facebook, it 👇 “blurs the boundaries between experimentation and large-scale deployment.” Looking forward to trying this out. 🤓 Src: Facebook

How Many Words Is This Dataset Worth? 🀯

Google recently released version 4 of the Open Images dataset, and it’s quite large. We’re talking a nine followed by six zeroes large: roughly nine million images, all labeled, with objects annotated with bounding boxes. Happy training! 📦 Src: Google

Do The Digital Worm πŸ›

Step 1: recreate the brain of the C. elegans worm as a neural network. 🧠 Step 2: ask it to park a car. 🚗 Researchers digitized the worm brain, the only fully mapped brain we have, as a 12-neuron network. The goal of this exercise was to create a neural network that humans can understand and parse, since the … Read More
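The two steps above can be sketched as a toy recurrent circuit. Only the 12-neuron count comes from the article; the sensor/output sizes and random weights here are invented placeholders, not the researchers' actual worm wiring:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a few sensor inputs feeding a 12-neuron recurrent
# circuit that emits two control signals (say, steering and throttle).
n_inputs, n_hidden, n_outputs = 4, 12, 2

W_in = rng.normal(scale=0.5, size=(n_hidden, n_inputs))   # sensors -> neurons
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))  # neuron -> neuron
W_out = rng.normal(scale=0.5, size=(n_outputs, n_hidden)) # neurons -> controls

def step(state, sensors):
    """One update of the tiny circuit: mix sensors with recurrent state."""
    state = np.tanh(W_in @ sensors + W_rec @ state)
    controls = W_out @ state
    return state, controls

state = np.zeros(n_hidden)
for _ in range(5):                       # a few simulated time steps
    sensors = rng.normal(size=n_inputs)  # fake sensor readings
    state, controls = step(state, sensors)

print(controls.shape)  # (2,)
```

With only 12 neurons, every weight in `W_rec` can be inspected by hand, which is the interpretability point the researchers were after.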

Fast.DataSets 🔣

Fast.ai and AWS have teamed up to make some of the most popular deep learning datasets “available in a single place, using standard formats, on reliable and fast infrastructure.” Woo! 🙌 MNIST, CIFAR, IMDb, Wikitext, and more! Check β€˜em out. Src:

DeepFakes Get More Realistic πŸ˜–

Remember back when I said I was terrified about deepfakes? Well, it’s not getting any better. 😟 Apparently researchers at Carnegie Mellon and Facebook’s Reality Lab decided there was nothing to worry about and that the method for making them needed to be better. So they gave us Recycle-GAN. ♻️ “We introduce a data-driven approach for unsupervised video retargeting that translates …” Read More

ELMo Really Does Know His Words πŸ‘Ή

I’m super interested in the world of NLP (natural language processing), so the news that performance increased dramatically with ELMo piqued my interest. 💡 The biggest benefit in my eyes is that this method doesn’t require labeled data, which means the world of the written word is our oyster. 🐚 Yeah, yeah, word embeddings don’t require labeled data either. ELMo can … Read More
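Here’s the distinction in a toy sketch (the words and vectors are invented for illustration): a static embedding table hands back one fixed vector per word no matter the sentence, whereas a contextual model like ELMo computes the vector from the surrounding text.

```python
# Toy lookup table standing in for pretrained static word embeddings.
# The 2-d vectors are made up purely for illustration.
static_embeddings = {
    "bank": [0.2, 0.7],
    "river": [0.9, 0.1],
    "money": [0.1, 0.8],
}

def embed(word):
    """Static lookup: the sentence around the word is ignored entirely."""
    return static_embeddings[word]

# The same vector comes back for very different senses of "bank"...
v1 = embed("bank")  # as in "river bank"
v2 = embed("bank")  # as in "bank account"
print(v1 == v2)  # True -- no context sensitivity

# ...whereas a contextual model like ELMo would run the whole sentence
# through its network and give the two uses different vectors.
```

That context sensitivity is what ELMo adds on top of plain word embeddings, and it gets it from unlabeled text alone.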

(Compute) Size Doesn’t Matter 📏

A small team recently set some new speed benchmarks on the ImageNet image recognition dataset. Why is this noteworthy? Because they did it on an AWS instance that cost $40 total. 🏅 “We entered this competition because we wanted to show that you don’t have to have huge resources to be at the cutting edge …” Read More

When Gradients Explode (or Vanish) πŸ’₯

This is a nice quick read on how to combat exploding or vanishing gradients, a problem that wreaks havoc on your deep learning models. 👹 My TL;DR: Exploding gradient? Use gradient clipping. It sets a ceiling on your gradient’s magnitude but keeps its direction. ✂️ Vanishing gradient? If you’re using an RNN, use an LSTM instead. ✔️ Src: Learn.Love.AI.

Fakes Get More Real πŸŽ₯

This is why I’m terrified of DeepFakes! But also, think of the potential for visual artistic mediums like movies and TV. But also, think of the implications for politics. I find it fitting that they used a lot of political figures as examples, since this could majorly disrupt the field. 🗳️ My first concern was with regard to detection, especially … Read More

Demand a Recount: DeepFake Edition πŸ—³

I’m not the only person worried about the potential impact of Deepfakes on politics (not that I claimed to be or thought I was). Apparently there is a Twitter wager about when a DeepFake political video will hit 2 million views before getting debunked. 🐦 I had mostly been thinking about the potential of these tools to be used like … Read More