Learnings & Musings on AI, ML, Data Science & Python

US Government Opens the Kimono 🔓

Merry Christmas, data nerds! 🎁 Congress has passed the OPEN Government Data Act, which should mean a bunch of new, shiny data to play with, and in a sector that could definitely benefit from what data analysis and machine learning bring to bear. 📜 Find more datasets here. Src: E Pluribus Unum

Data Sets Get Lawyered 👩‍⚖️

The LawyerBot 3000 might soon be a reality thanks to Harvard. They have digitized over 6 million cases to aid in the development of AI systems for the legal sector. So fire up your NLP and get ready to object! ⚖️ Src: Caselaw Access Project

How Many Words Is This Dataset Worth? 🤯

Google recently released version 4 of the Open Images dataset and it’s quite large. We’re talking a nine followed by six zeroes large, with all of the images labeled and their content boxed and labeled. Happy training! 📦 Src: Google

Fast.DataSets 🔣 and AWS have teamed up to make some of the most popular deep learning datasets “available in a single place, using standard formats, on reliable and fast infrastructure.” Woo! 🙌 MNIST, CIFAR, IMDb, Wikitext, and more! Check ’em out. Src:

The Economist Opens Their Data

The Economist has announced they will be open sourcing their data, starting with the Big Mac Index. 🍔 It also sounds like they’ll be providing a glimpse into their process via Jupyter notebooks, which is really cool. 📓 Src: The Economist

Lazy Faire 🇺🇸🇨🇳

At the US government’s current rate of uninvolvement in the AI sector, China will overtake it in its quest for AI overlord status by the end of the year. At least when it comes to spending, that is; the rest might not be far behind, though. 💰 One of the recommendations from a subcommittee is to expedite the approval of the OPEN … Read More

Dataset Database 🗄

What does ML want? Data! When does it want it? All the time! But specifically, whenever you are going to train, test, and deploy a model. Where do you get this data? I’m glad you asked! 😃 Here is a collection of datasets I’ve come across. I’ll update it as I find more. ➕ Computer Vision Open Images V4 from … Read More

Unbiased Faces 👶🏻👩🏽👴🏿

IBM will be releasing a data set of faces across all ethnicities, genders, and ages, both to avoid bias in future facial recognition systems and to test existing systems for bias. Simply put, this is awesome. 🙌 It’s also interesting to see how ethics, fairness, and openness are being used as positive differentiators by major competitors in this new tech race. … Read More