Interesting Content in AI, Software, Business, and Tech- 03/06/2024 [Updates]
Content to help you keep up with Machine Learning, Deep Learning, Data Science, Software Engineering, Finance, Business, and more
Hey, it’s Devansh 👋👋
In issues of Updates, I will share interesting content I came across. While the focus will be on AI and Tech, the ideas might range from business, philosophy, ethics, and much more. The goal is to share interesting content with y’all so that you can get a peek behind the scenes into my research process.
I put a lot of effort into creating work that is informative, useful, and independent from undue influence. If you’d like to support my writing, consider becoming a premium subscriber to my sister publication Tech Made Simple to support my crippling chocolate milk addiction. Use the button below for a lifetime 50% discount (5 USD/month, or 50 USD/year).
A lot of people reach out to me for reading recommendations. I figured I’d start sharing whatever AI Papers/Publications, interesting books, videos, etc. I came across each week. Some will be technical, others not really. I will add whatever content I found really informative (and I remembered throughout the week). These won’t always be the most recent publications- just the ones I’m paying attention to this week. Without further ado, here are interesting readings/viewings for 03/06/2024. If you missed last week’s readings, you can find them here.
Reminder- We started an AI Made Simple Subreddit. Come join us over here- https://www.reddit.com/r/AIMadeSimple/. If you’d like to stay on top of community events and updates, join the discord for our cult here: https://discord.com/invite/EgrVtXSjYf.
Community Spotlight: Rubén Domínguez Ibar
Ruben writes the VC Corner newsletter, where he uses his insights as a top Investor to share the most interesting developments of the week. His work is super useful if you're looking to see the business side of things. More than ever, it's important to see things from various angles, and Ruben's work will help you in that regard.
If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments/by reaching out to me. There are no rules- you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.
Previews
Curious about what articles I’m working on? Here are the previews for the next planned articles-
How I did quality checks for the newsletter when I was just starting my writing journey [Saturday]
RWKV
For personal reasons, I didn't do a lot of reading over the past week. This week's updates will be relatively small.
Highly Recommended
These are pieces that I feel are particularly well done. If you don’t have much time, make sure you at least catch these works.
Matryoshka Representation Learning: Flexible Embeddings
A fantastic overview of the above idea by Vaibhav Nakrani. I have been very impressed with his LinkedIn posts recently, so make sure you check them out if you're looking for shorter-form content on Deep Learning and AI.
The problem:
↳ The indexing cost of embeddings scales linearly with increase in dimension size and size of the database.
↳ Can we create a versatile way of representing data that works well for different tasks, even when those tasks have different levels of available computing power?
The MRL Paper addresses this question head on. I think it is quite a novel way to compute embeddings.
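To make the idea a bit more concrete, here is a minimal sketch of how a Matryoshka-style embedding is typically used at query time: keep only the first few dimensions, re-normalize, and compare. The vectors, the `truncate_and_normalize` helper, and the `index_bytes` cost estimate (4 bytes per value) are all made up for illustration. A real MRL model is trained so that these prefixes stay meaningful, which toy numbers cannot demonstrate.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` coordinates and rescale them to unit length."""
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        raise ValueError("prefix has zero norm")
    return [x / norm for x in prefix]

def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))

def index_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for a flat index: linear in both database size and dimension."""
    return num_vectors * dim * bytes_per_value

# Hypothetical 8-dimensional embeddings (hand-written, not from a real model).
query = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.03]
docs = {
    "doc_a": [0.8, 0.2, 0.4, 0.1, 0.01, 0.04, 0.02, 0.01],
    "doc_b": [-0.5, 0.9, -0.1, 0.3, 0.02, 0.01, 0.05, 0.02],
}

def best_match(dim):
    """Name of the document closest to the query using only `dim` dimensions."""
    q = truncate_and_normalize(query, dim)
    return max(docs, key=lambda name: cosine(q, truncate_and_normalize(docs[name], dim)))

for dim in (2, 4, 8):
    print(dim, best_match(dim), index_bytes(1_000_000, dim))
```

The point of the loop is the trade-off: a smaller prefix shrinks the index proportionally, and with a well-trained model the ranking should change little.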
[ML News] Groq, Gemma, Sora, Gemini, and Air Canada's chatbot troubles
Yannic Kilcher's ML News is one of my favorite sources of information in AI. He is level-headed, funny, and very knowledgeable (he is also one of the 3 people b/c of whom I got into writing). I also appreciate that he mixes technical, social, and business news into the segments - so things never get repetitive. His channel is a must-watch for those who want to get into technical details of AI.
From Controversy to Dialogue - Beginner Level
In an increasingly screamy world where people love to talk past each other, I don't think I need to sell the importance of this work. It can also be applied to ourselves for better introspection.
In this part of my introduction to Boghossian’s and Lindsay’s How to Have Impossible Conversations, I'll discuss the first level of competency in challenging your conversation partner's beliefs in hopes that they'll consider questioning them.
The authors call this intervening in someone’s cognitions and giving them the gift of the doubt.
Eastern & Western Design: How Culture Rewires The Brain
Culture completely changes the way our brains process information. And because of that, various cultures design things very differently.
One of my motivations in exploring the limitations of data was to challenge the notion that some AI Folk have that with enough high-quality data, anything is possible. This is a pretty cool example of how Data tends to have so many limitations. TL;DR- Dead Fish can recognize human emotions.
A study published this year in Perspectives on Psychological Science noted that many papers in social neuroscience, the field that examines the neurobiology of social behavior, suffered from faulty analyses that produced "voodoo correlations" in their data. Separately, an analysis by researchers at the National Institute of Mental Health found that almost half of the neuroimaging studies published in well-known journals, including Nature and Science, contained unintentional biases capable of distorting conclusions...
Not in the least. The researchers used the defunct salmon to make a salient point. When statistical tests are made of the 130,000 tiny three-dimensional pictures (voxels) that make up an MRI image, some of them are bound to give "false positive" results purely by chance: in this case, a cluster of pixels just happened to illuminate in the region of the dead salmon's brain.
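The arithmetic behind that point is easy to check. Here is a minimal sketch, assuming every voxel is pure noise and tested independently at an uncorrected threshold of `alpha = 0.001`. The threshold is an assumption for illustration, since the excerpt only gives the voxel count. Under that assumption each test's p-value is uniform on [0, 1], so some voxels "light up" by chance alone.

```python
import random

def expected_false_positives(num_tests, alpha):
    """Expected number of chance 'hits' when every null hypothesis is true."""
    return num_tests * alpha

def simulate_false_positives(num_tests, alpha, seed=0):
    """Draw one uniform p-value per test and count how many fall below alpha."""
    rng = random.Random(seed)
    return sum(1 for _ in range(num_tests) if rng.random() < alpha)

def bonferroni_threshold(num_tests, alpha):
    """Per-test threshold that keeps the chance of any false positive near alpha."""
    return alpha / num_tests

NUM_VOXELS = 130_000   # voxel count quoted in the excerpt
ALPHA = 0.001          # assumed uncorrected threshold (illustrative)

print("expected chance hits:", expected_false_positives(NUM_VOXELS, ALPHA))
print("simulated chance hits:", simulate_false_positives(NUM_VOXELS, ALPHA))
print("Bonferroni per-voxel threshold:", bonferroni_threshold(NUM_VOXELS, ALPHA))
```

With these numbers the expected count is about 130 false positives from noise alone. Bonferroni is only one of several possible corrections, and a conservative one, but it shows how much stricter the per-voxel threshold has to get.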
The Dangerous Wild West of Online Gambling
A super interesting business case study of how various institutional players are trying to win at the online gambling market, by the always excellent Modern MBA.
If you’ve watched any American sports game recently, there’s a good chance you were blasted with an ad from FanDuel, DraftKings, BetMGM, WynnBET, Caesars Online, and various online casinos promising free money and million dollar jackpots upon signup.
Historically, sports betting and gambling in the US had been heavily restricted where you could only legally play offline at isolated places around the country like Las Vegas and Atlantic City. But in the past 5 years, the landscape has evolved dramatically as venture capitalists have invested billions into mobile-first gambling startups like FanDuel and DraftKings.
But despite the relentless advertisements and celebrity endorsements, online gambling and sports betting is nothing more than a wild west where no one is telling the truth. It’s a market so new that everyone boasts that they have the greatest market share, the best apps, the most users, and the fastest growth. In this episode, we’ll unmask the wild wasteland of online gambling in the US, the economics of traditional casinos, the political dynamics behind this market, and what FanDuel and DraftKings are really betting on as they pursue adoption at every turn.
When Everything's A Crisis... Nothing Is
Darin Soat is one of the best at making relevant finance/economics videos. This one on financial crisis fatigue and how it hits the most vulnerable people is a must-watch.
New entrants into the job market are struggling to get into roles that were desperate for workers just three years ago. If people DO lose their jobs, they have record low savings to support themselves. So then, why are people spending more than ever? The financial situation of the average American HAS gotten worse in the past twelve months.
According to data from Equifax and the New York Fed, money that people saved during the pandemic is gone and, in its place, people are using credit cards and short term loans to cover expenses. Buying or renting a house is more unaffordable than ever due to a trio of low availability, high mortgage rates, and high prices AND with an election on the horizon people are worried that things will only get worse when appeasing voters is no longer top priority.
But when everything is a crisis, nothing is… People have seen the headlines, and they just don’t care anymore. Crisis fatigue is a real problem and it’s costing you thousands of dollars every year (whether you realize it or not) for three different reasons.
Top 8 leaderboards to choose the right AI model for your task
The space of generative AI is moving fast across modalities. From language and text to image and video. In 2023 alone, we saw the state-of-the-art (SOTA) language model, GPT, grow from a context window of 4k to 128k, with a remarkable boost in performance: +16% and +18% on MMLU and HumanEval, respectively.
...This post, the fourth in a series, aims to guide developers and researchers in choosing the right language model for their tasks, ensuring they launch accurate, efficient, and cost-effective LLM applications.
AI Content
GraphRAG: Unlocking LLM discovery on narrative private data
Had a whole post about Graphs and how they can improve RAG. Microsoft agrees. As they say, great minds think alike.
Retrieval-Augmented Generation (RAG) is a technique to search for information based on a user query and provide the results as reference for an AI answer to be generated. This technique is an important part of most LLM-based tools and the majority of RAG approaches use vector similarity as the search technique. GraphRAG uses LLM-generated knowledge graphs to provide substantial improvements in question-and-answer performance when conducting document analysis of complex information. This builds upon our recent research, which points to the power of prompt augmentation when performing discovery on private datasets. Here, we define private dataset as data that the LLM is not trained on and has never seen before, such as an enterprise’s proprietary research, business documents, or communications. Baseline RAG was created to help solve this problem, but we observe situations where baseline RAG performs very poorly. For example:
Baseline RAG struggles to connect the dots. This happens when answering a question requires traversing disparate pieces of information through their shared attributes in order to provide new synthesized insights.
Baseline RAG performs poorly when being asked to holistically understand summarized semantic concepts over large data collections or even singular large documents.
To address this, the tech community is working to develop methods that extend and enhance RAG (e.g., LlamaIndex). Microsoft Research’s new approach, GraphRAG, uses the LLM to create a knowledge graph based on the private dataset. This graph is then used alongside graph machine learning to perform prompt augmentation at query time. GraphRAG shows substantial improvement in answering the two classes of questions described above, demonstrating intelligence or mastery that outperforms other approaches previously applied to private datasets.
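The "connect the dots" failure described above can be illustrated with a toy sketch. This is not Microsoft's GraphRAG implementation. Bag-of-words cosine similarity stands in for learned embeddings, the triples are hand-written stand-ins for an LLM-generated knowledge graph, and every name and document here is hypothetical.

```python
import math
from collections import Counter, defaultdict

def bow(text):
    """Bag-of-words vector; a crude stand-in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def vector_retrieve(query, docs, k=1):
    """Baseline retrieval: rank documents by similarity to the query."""
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def build_graph(triples):
    """Undirected entity graph from (subject, relation, object) triples."""
    graph = defaultdict(set)
    for subj, _rel, obj in triples:
        graph[subj].add(obj)
        graph[obj].add(subj)
    return graph

def related_entities(graph, start, max_hops):
    """Breadth-first walk: entities reachable from `start` within `max_hops`."""
    seen, frontier = {start}, [start]
    for _ in range(max_hops):
        nxt = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    nxt.append(neighbor)
        frontier = nxt
    return seen - {start}

docs = [
    "alice leads the payments team",
    "the payments team owns the billing service",
    "the billing service had an outage in march",
]
triples = [  # hand-written; GraphRAG would have an LLM extract these
    ("alice", "leads", "payments team"),
    ("payments team", "owns", "billing service"),
    ("billing service", "had", "march outage"),
]
query = "what outage affected alice"

print(vector_retrieve(query, docs, k=1))
print(related_entities(build_graph(triples), "alice", max_hops=3))
```

Similarity search surfaces the document that shares words with the query, but the answer sits three hops away through shared entities. Walking the graph recovers that chain, which could then be added to the prompt.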
Elon Musk sues OpenAI ⚖️, Rethinking the startup MVP 🔄, The Future of Digital Health 🩺, & more!
This thread by Logan is a simple overview of why engineering for Alignment, and getting LLMs to behave the way you want them to, is such a pain.
If you liked this article and wish to share it, please refer to the following guidelines.
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
AI Newsletter- https://artificialintelligencemadesimple.substack.com/
My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819
I have always been skeptical of online gambling platforms. How must one know that the algorithm is not rigged?
They employ cameras and security on blackjack tables to see if someone is not card counting. How do they deal with that online.
I have heard so many cases of people losing money gambling online. Indeed a dangerous Wild West.
Thanks for the inclusion Devansh! Everyone likes to critique all the shiny new LLMs and no one likes to actually think about what makes them go wrong 🤔