An Entrepreneur’s Perspective – Venture Investing in AI

October 26, 2023


We recently hosted our second annual Ideas & Networking Conference in New York City on September 21st.  It was a fantastic day with great speakers and participants.   I specifically want to thank our special guest speakers, Jim Waskovich, Shaun Johnson, Elisabeth DeMarse and Jeff Gramm, for sharing their ideas and insights.  Also, thanks to our many attendees!

As a follow-up to that event, I am excited to share the edited excerpts from my fireside chat with Shaun Johnson.  Shaun is the General Partner of AIX Ventures, an AI-focused venture fund.  Shaun discusses the rise of large language models and generative AI, talks about the future potential of AI, digs into some of his latest venture investments, and focuses on the importance of NVIDIA’s GPU technology.  I am confident you will find value in this interview.

In the coming months, I anticipate sharing additional excerpts from my chats with the other conference participants.  Enjoy!

Best Regards,

William C. Martin

Topics in this Issue of An Entrepreneur’s Perspective:


Interview with Shaun Johnson: Venture Investing in AI

Welcome!  Can you tell us a bit about your experiences leading up to starting AIX?

I started at the University of Texas studying Electrical Engineering. I went to Stanford for more engineering and then spent time as an operator building deep-tech product hardware at the transistor level and software at the firmware level.  Then some would say I went to the dark side by switching from Engineering at Stanford to Business at Berkeley.

After that, I spent some time in venture capital, then started leading engineering teams at various Sequoia-backed companies. The last company, Lilt, is building an AI-powered localization solution for large governments and enterprises. I found myself in the thick of AI and that started the journey to AIX.

What made you move from being an enterprise practitioner to being an allocator?

Scale. I always thought about how I’m using my time and how I can help as many CEOs as possible get to product-market fit, run engineering, manage product and design, and run their entire firm. Venture is a really great way to scale. I work with our founders where and when they need the most support.

How did you go about building the team?

When I was at Berkeley, I thought a lot about starting a venture firm and I thought it was just a terrible idea. It seemed like everybody was thinking, “Oh, I should start a venture firm. Great idea.” So, I concluded, I should go back out and spend more time in the trenches, more time helping CEOs. All along, though, I kept in my mind the pattern recognition of what’s needed to start a venture firm.

In 2019, when I was working at Lilt, I saw what we were doing, what our teams were doing with models, what they were doing with on-the-fly training and how all of this was driving a metric we called “word prediction accuracy”. This technology was creating a ton of value for our customers in terms of turnaround time, price per word, and other metrics you care about in the localization space.

I started thinking that I should reconnect with Pieter Abbeel. Pieter and I were at Stanford together 23 years ago and Pieter has had wild success. He’s one of the top AI people in the world. He started two companies; his latest is called Covariant, which was recently valued at $700 million post-money. He’s U.C. Berkeley’s Professor of AI and Robotics and he’s won multiple awards for his contributions to deep learning and diffusion models. We started talking about AI and everything that he was doing around advising and starting companies.

I realized that Pieter had access to the best founders. My thought was, there has to be a venture firm here, right? Why isn’t there a firm? At the time, and it’s still true today, if you go to a firm that’s investing in AI and ask, “Where’s your AI experience?” you usually get a terrible answer. They say, “Oh, a partner worked in AI like five years ago,” or “We hired someone from undergrad who knows something,” or “There’s a scientific advisor we can talk to every 30 days.” It’s not a great answer.

Our idea was to build a unique type of firm where some of the world’s top practitioners are heavily engaged in sourcing, diligence, decision-making, and support. That’s AIX. I pitched Pieter this idea. He responded, “Great, you should meet Richard Socher.” Richard is the person who first put natural language processing and deep learning together, back in 2013. If you look at the GPT papers, they reference Richard’s work.

Richard got it right away; it was something he had wanted to start for a while. He connected with Christopher Manning and Anthony Goldbloom, and AIX was born. These amazing folks bring all of their deal flow exclusively to AIX and engage heavily in diligence, decision-making, and founder support. So that’s our cheat code for access to amazing AI founders.

I also met Chris Manning from Stanford’s AI Lab, who is on your team.

Yes, we brought together Richard, Pieter, Chris Manning and Anthony Goldbloom. Chris, at Stanford, is the world’s most cited NLP (natural language processing) researcher, and Anthony Goldbloom founded Kaggle, the largest machine learning community in the world. These individuals have communities all around them, and we amplify that. Richard has an 80-acre ranch 20 minutes west of Stanford; come visit next year. We host hundreds of founders there, and it’s an amazing ecosystem. We use that as our flywheel for deal flow.

Organizationally, how does it work if your partners have day jobs?

Yes, they’re academics and entrepreneurs. Sometimes they’re both. They are all first-call investors. When amazing AI folks have an idea, they call Pieter and say, “Hey, I want to bounce this idea off of you.” And Pieter says, “Great.” They usually meet at his house or somewhere out in the wild, and Pieter gets a sense of the idea. When the founders are ready for their first financing, he can make an introduction to me or someone on my team. We start the diligence process with Pieter. We circulate a memo to the entire team, usually asynchronously, through Slack for example. Every week we have our investment committee, which everyone shows up to. It’s a mixed synchronous and asynchronous, new-age process where we’re considering top people in the space who often have an insight to disrupt an entire industry.

Diving into AI specifically, tell us what is a large language model? How does it work? Why is it important?

What is a large language model? It’s a neural network that’s trained on data, and it’s modeled after the brain; that’s why it’s called a neural network. It has digital neurons that are modeled after the way our brains work. Neurons, as you’ve probably learned sometime in your career path, have signals coming in, and then the neuron will either fire or won’t fire. That’s very non-linear, right? You have all these signals of varying levels that come into the neuron, and then you get a signal out or you don’t. And there are many of them. Somehow, when I ask you a question, you’re able to process everything I’m saying; it’s phenomenal. It’s because of all the interconnections of these neurons. That’s exactly how a neural network works. All of us here have been trained by the educational system and by our parents; that’s our data. Machines are trained analogously, through supervised learning, unsupervised learning, and reinforcement learning. We’re all parents here, whether or not you like it, to machines.
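The fire-or-don’t-fire behavior described here can be sketched in a few lines. This is an illustrative single artificial neuron with made-up weights, not code from any company mentioned in this interview:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted signals come in; a non-linear activation (a sigmoid here)
    # decides how strongly the neuron "fires", squashing output to (0, 1).
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Three input signals of varying strength, one output signal.
out = neuron([0.5, -1.2, 3.0], weights=[0.8, 0.1, -0.4], bias=0.2)
```

A real network wires millions of these together in layers, and training adjusts the weights and biases.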

Machines are at a place where it’s not quite human intelligence, but they can now understand what we say in all its nuances. English is a nuanced language, yet the machine can interpret your intention and generate creative output. It can finish your thought or respond to a prompt. It can create new images. It can create videos. It’s in its infancy. When you’re able to do things like understand human language and produce creative output, then for many enterprise workers, everything is going to change. When I look at the tasks I’m doing, many times I think, “Why isn’t there an AI doing this?” And very quickly there will be.

I’ve heard this technology described as a “sequence prediction model.”  You train it on enough information, and it knows what comes next. Is there creativity there? Is there entropy?

Interesting question. If you think about the human faculty, our neurons are trained. We’re hooked up a certain way, and we’re able to do things that are quite mundane and things that are quite creative. Machines are the same, right? They can do repetitive tasks, and they can be quite creative. I think what we’re sorting out now is how much data, the quality of the data, and the architecture.
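To make the “sequence prediction plus entropy” idea concrete, here is a toy next-word predictor. It merely counts word pairs in a tiny made-up corpus, whereas a real LLM learns a similar distribution over tokens with a neural network; a sampling temperature is one common knob for how surprising (entropic) the output is:

```python
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran on the grass".split()

# Count which word follows which: a crude "sequence prediction model".
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def next_word(word, temperature=1.0):
    counts = follows[word]
    # Higher temperature flattens the distribution, so rarer
    # ("more creative") continuations get sampled more often.
    weights = [c ** (1.0 / temperature) for c in counts.values()]
    return random.choices(list(counts), weights=weights)[0]

print(next_word("the"))  # one of: cat, mat, grass
```

The sampling step is where the “creativity” lives: the model is not locked to the single most likely continuation.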

AGI (Artificial General Intelligence) isn’t going to be realized through the transformer architecture. AGI is the idea that machines will have consciousness and be on par with a human and potentially eclipse human intelligence. We’re not there yet. People might speculate, but there’s no concrete proof that machines have consciousness.

Getting to AGI, is that going to be based on an LLM getting bigger and better, or are there different approaches or different flavors?

I think the consensus is there’s going to be a different architecture. It must evolve past the LLM and past the transformer.

To make these models better, is it just a function of more data?  Or is it more NVIDIA GPUs, or more specialized data? 

When you say “make them better,” I think the question is, “for who?” and “what’s the context?”  As venture investors, we are really focused on the user experience. If you’re a user in the enterprise, let’s say you’re doing an audit of another firm, how can you most productively and most efficiently use AI to get your work done? It doesn’t really matter if there’s one model or ten models. What matters is that your work is done effectively, productively. In that case, we’re not really debating whether there is one model winner or not.

So you don’t think the owner of the model accrues all of the value per se?

It’s going to be entirely user-interface based and there’s probably going to be multiple models in the background. There’s probably a foundation model to manage the long tail of use cases, but there’s also likely a niche model which has many fewer neurons that is trained on a specific corpus of data for a particular application to handle, let’s say, 1 or 2 sigma of the use cases.
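The multi-model setup described above can be sketched as a simple router. Everything here, the model names and the keyword rule, is hypothetical, just to show the shape of the idea:

```python
# Hypothetical router: common, well-understood requests go to a small
# specialized model; everything else (the long tail) goes to a large
# foundation model. Model names and keywords are invented for illustration.
NICHE_KEYWORDS = {"audit", "invoice", "ledger"}

def route(prompt: str) -> str:
    words = set(prompt.lower().split())
    if words & NICHE_KEYWORDS:
        return "niche-finance-model"  # fewer neurons, cheaper, task-specific
    return "foundation-model"         # handles the long tail of use cases

print(route("summarize this audit report"))  # niche-finance-model
print(route("write me a poem"))              # foundation-model
```

In production the routing decision would itself often be a learned classifier rather than a keyword match, but the economics are the same: reserve the expensive model for the cases only it can handle.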

Facebook has open-sourced a robust LLM. What do you think about their strategy and about building business models on top of that?

Open-source models have various licensing models, but I think Hugging Face is the preeminent open-source platform.  In full disclosure, we are Hugging Face investors.

That’s one of your investments? Tell everyone what that company does as they’ve seemingly raised a gazillion dollars.

There are two ideas: open source and closed source. Open source means everybody can view the code and the product. With closed source, nobody can view the code or the product; you access it through closed channels, potentially ones you pay for. What Clem at Hugging Face has done is an interesting story. Back in 2016/2017, the team members were students of my co-founder, who was a professor at Stanford teaching these really smart Parisians.

Richard took notice and invested $5 million. Have you seen the movie “Her”? It’s basically this idea that you could have an AI talk to you and have a relationship with you. That was their vision. What they realized is that their NLP models were getting so good that they published them. They ran an experiment where they published a model and saw such high engagement on it. Other LLM engineers started looking at it, commenting on it, wanting to have calls with the team about it. They thought, “We should do this. Let’s start publishing our models to the open world.” Now, six years later, they’re valued not at $5 million but at $5 billion. They’re the largest open-source community, period, with companies contributing and people posting their models.

A central clearinghouse for open-source AI models?

Exactly. Companies contributing to the repository include Microsoft, Google, Meta, government entities; you could go on and on. That’s a business with really strong network effects. Now the question in the back of your mind is probably: if the model is open source, how do you monetize? They monetize through inference, making those models available for production usage. Say you have an application in mind, maybe a computer vision application. You go to Hugging Face and find your model; it’s potentially pre-trained, but you don’t want to host it on a server yourself, and infrastructure takes work. Hugging Face does all of that for you, and they’ll just charge you per inference, per usage, for their model.

Fascinating. Can you talk about a recent investment or two and what they’re working on and how they’re applying the technology?

Yes, there are so many exciting investments. There’s a company called Athelas. Tanay Tandon is the CEO. His insight was that you could use a simple convolutional neural net to take a blood sample and count red and white blood cells. This is done in big labs today: Labcorp, Quest, and so on. Tanay did that one simple thing; he just wanted to count red and white blood cells. Years later, it is FDA-approved with a $1.5 billion valuation. We were the first check in. That’s a fantastic story. They have a little device you can use at home to draw a sample, saving you a trip to the lab to have your red and white blood cells counted. If you’re a cancer patient during COVID, for example, you don’t want to go to the hospital; you’re immunocompromised. You can imagine COVID was quite the tailwind for them.

Is that just optical imaging? How does it work?

Yes, you take a picture and send it to the cloud, where a neural net is applied to the image. It just counts. On the news today, you may have seen a photo or a video with boxes drawn around objects, labeling what they are. You’re labeling each cell red or white, summing them up, and making a recommendation, essentially. That’s one company that we think is really interesting.
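The counting step really is as simple as it sounds. Here is an illustrative tally over detector output; the detections are fabricated, and the real Athelas pipeline is certainly more involved:

```python
from collections import Counter

# Fabricated detector output: one bounding box and label per cell found.
detections = [
    {"box": (10, 10, 22, 22), "label": "red"},
    {"box": (30, 12, 42, 24), "label": "red"},
    {"box": (55, 40, 67, 52), "label": "white"},
]

counts = Counter(d["label"] for d in detections)
print(counts["red"], counts["white"])  # 2 1
```

The hard part is the detector upstream; once each cell has a labeled box, the clinical number is a sum.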

Other companies? We mentioned Hugging Face. We think there are three places in the stack. To clarify, we focus on AI on the B2B side; we’re not as consumer focused, we’re more enterprise focused. We think about the application layer, the infrastructure layer, and the foundation layer. You’ve probably heard a lot about the foundation layer. These are companies that are raising $100 million at a billion-dollar valuation without really having done anything. We don’t invest there. It’s not our business to invest at $1 billion valuations.

Those are companies building the raw models?

Yes, they’re building the raw models. The reason that it’s so expensive is compute is expensive, accessing the GPUs to do the training is expensive, and the return profiles are way outside of early stage, so we don’t play there.

We play in application and infrastructure. Infrastructure is this idea that machine learning engineers need tools to get models in production, to monitor models in production, to optimize models in production. This is the nitty gritty.

At the application layer, it’s everything from what I alluded to earlier. How do we think about auditing? How do you think about accounting? About finance? About sales? A salesperson today, if they’re at the top of the stack and they have a leads list, a prospect list, what do they do next? Often, they compose an email. The point of that email is to maximize response rate. You want that email to be highly tailored, highly specific, highly resonant with the receiving party. It sounds great for AI, right? AI could very quickly segment customers and pull data. Maybe, through a technique called retrieval-augmented generation, the AI could compose an email that the salesperson could slightly modify before sending, maximizing the response rate and minimizing the time it takes to compose.
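A toy version of that retrieval-augmented flow might look like the following. The CRM data and function names are invented, and the final model call is stubbed out, since in practice the assembled prompt would be sent to an LLM API:

```python
# Hypothetical mini-RAG for sales email drafting: retrieve facts about
# the prospect, then build a prompt for a language model to complete.
CRM_NOTES = {
    "Acme Corp": ["evaluating a cloud migration", "the CFO is the buyer"],
}

def retrieve(prospect: str) -> list[str]:
    # Real systems would search a vector index; a dict lookup stands in here.
    return CRM_NOTES.get(prospect, [])

def build_prompt(prospect: str) -> str:
    facts = "; ".join(retrieve(prospect))
    return (f"Draft a short, specific sales email to {prospect}. "
            f"Retrieved context: {facts}. Maximize response rate.")

prompt = build_prompt("Acme Corp")
# `prompt` would go to an LLM; the rep edits the draft before sending.
```

The retrieval step is what grounds the draft in facts about the prospect instead of generic boilerplate.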

I think my son’s interested in that for his homework.

Well, yes, I think everybody is. That’s the other side of it. I was talking to my dad five years ago, and he said something like, “I saw a person on TV,” asserting that the person was real. I responded, “Well, how do you know it’s true?” Was there a digital stamp? How do you know that person exists? I think more and more, what we’re going to start to see online is that influencers are probably not real; they’re probably digital creations. That’s another part of it: how do we detect true from false? Which is, of course, a larger narrative in our world today.

Guest Q&A: How do you think AI is going to ultimately impact application software? I think it could lower the moat and might increase the total amount of code. Observability might have good tailwinds. Also, with these copilots, how do you think they’ll impact it? Will they become commoditized or what are the things you look for in that?

Yes, there’s a new way of thinking about software. Andrej Karpathy gives good talks on this subject; he used to be the head of AI at Tesla. He would show, here’s the Tesla code base, here’s the percentage of it that’s pure AI and ML, and here’s the percentage that’s procedural code, which all of us are very familiar with: you write “if then” statements, and it’s very systematic and deterministic. Developers are moving more and more to coding with data, where you have some corpus of data around human behavior, around what people are trying to do. Then you train a model, and the model can do that.

But I want to differentiate that, because you dived into code. Maybe there’s not even a lot of deterministic code anymore. Maybe there are models driving user interfaces, with a human in the loop like us saying, “you’re on the path or off the path.” The reality all of us know today is that when you use ChatGPT, you input something, the output comes back, and you think, “Oh, that’s dumb.” Then you change it or correct it. It’s learning from you. You are the human in the loop. It’s all very deliberate. You’re the parent today providing feedback, and it’s getting better on your cognitive dime, if you will. I think broadly what you should expect is high degrees of automation, moving from the simplest tasks to higher-IQ tasks over time. At the high level, we’re just on the path to AGI, and I think the world is trying to figure out what that means.

Guest Q&A: How do you think AI should be regulated going forward? Do you worry about either bad actors or artificial intelligence itself taking over at some point? How should governments be trying to regulate this? Could these models “escape into the wild?”

Yes, there are so many viewpoints from so many smart people on this. I think the most important thing is alignment. Alignment is the degree to which the goals of humans are aligned with the goals of machines, and vice versa. The thing I look for, when any company with quite a lot of capital, such as OpenAI, puts out a new intelligence model, is: what about the alignment? Is the AI more or less aligned with human goals? That’s so important.

I don’t know the answer to regulation. It sounds like people are saying yes, people are saying no. There are good arguments on both sides. There’s always the bad actor piece. The best models have to be well aligned with humans so that we don’t get at odds with each other.

Guest Q&A: There are many AI companies, lots of startups. From your experience, how do you see the challenges they have when they bring their products to the market? I assume most startups fail. What are the main challenges of going to the market with an AI product? 

Before OpenAI came out with ChatGPT, everybody used to be quite nervous about AI, or at least there was a large segment of buyers who couldn’t comprehend it and were nervous about it. They weren’t quite incentivized to deploy it into the enterprise. OpenAI has changed all of that. Now you see it across enterprises. I’ve been here in New York for a couple of days, meeting with banks and record labels. Everybody is talking about AI. Some are more conservative, some less, but they all have transformation roadmaps. It’s going to be everywhere; the question is just how much time. That’s probably the biggest thing to look at.

If you’re a particular founder trying to establish product-market fit and sales velocity, what’s your customer’s adoption roadmap? Say you’re selling agents to a bank (agents are individual workers that can help automate various tasks). That’s probably going to involve more friction than selling to Microsoft, which is quite tech-adept. I think the buyer persona is the key gate.

But you could start a company where it’s just an API call to access the model. The model is available to anyone who wants to innovate on top of it, after someone else has spent billions of dollars building it.

Enterprises, especially conservative banks, lock down which APIs can be brought inside. I was at a bank yesterday, and they told me that everyone inside the company is operating as if AI has not happened: go on with your normal life, use it on your own personal devices, but nothing inside the company. Meanwhile, internally, they’re writing API (application programming interface) wrappers around these external services, like ChatGPT and Midjourney. Then the enterprise can say who can use what, for how long, and so on.

The banks didn’t get on the cloud until 2015 or 2016.

I had a startup come to me recently and say, “banks are adopting AI at this crazy rate.” I had to go out and see for myself, and we didn’t invest in that company.

Guest Q&A: You had mentioned, why isn’t there an AI for that? What are you using in your personal or professional life to make yourself more productive?

There’s a whole stack of things to think about. First, there’s personal and professional. ChatGPT has a free version, Perplexity AI is quite interesting, and my co-founder’s company is impressive.

What does Perplexity do?

It’s AI search. If I’m doing some creative work, I’m going to go talk about AI in front of an audience, how should I entertain that audience? I should probably have an AI copilot give me some ideas that I might not have thought about.

Is this a programmed speech here? Are you a real person?

I’m a human. I’m the human in the loop that took the AI feedback and put out what I wanted as the master of the message. I have a two-year-old; how do you think about feeding a toddler? You can have a conversation with AI about it, and it’s much easier than Google. I use it both personally and professionally. On the professional side, incumbents are certainly integrating AI into their workflows, and new products are coming out that are specialized around new workflows.

I have a friend who spends an hour or two each night talking to ChatGPT, asking questions about everything. He’s said that he’s learned so much in the last six months on history and other subjects, as you can imagine.

There’s this really interesting insight on AI that Geoff Hinton talked about recently. He’s a prominent AI researcher who recently left Google because he was quite nervous about what’s happening. He basically said, “You know, humans have these neural hookups, the way your brain is wired. But for you and I to exchange information, you have your hookup, I have my hookup, and we’re limited to about 100 words a minute. That’s how fast we can exchange. But if you have two machine intelligences, they can have the same model, and you can trade the weights one for one. Now your information exchange is massive.”

Is demand insatiable for GPUs or is it just a one-time rush to build out the initial infrastructure and demand normalizes? If NVIDIA could sell ten times as many chips tomorrow, is there demand for it?

Yes, there’s demand for 1000x more chips. Of course, there’s a downward-sloping demand curve: the cheaper it gets, the more demand goes up. Right now, they could probably 1000x the supply and there would be buyers.

What’s Google’s position in the market? They invented the transformer that enabled all of this. How do you see their potential?

There are whispers; I’ve been talking to some enterprises that say they’ve seen it firsthand. I haven’t seen it firsthand, but there are whispers of a model called Gemini that is going to outpace what OpenAI has been able to do. In fact, if you look at the Twitter/X thread of Sam Altman, the CEO of OpenAI, from a couple of weeks ago, he responded to it quite emotionally, because it would certainly eclipse anything OpenAI has been able to do. Again, I haven’t seen it, but I’ve talked to enterprises that have, and they say it’s, you know, uncanny.

Anything else I should ask?

What an amazing time to be alive. We’ve all seen the Internet, we’ve all seen mobile, and each time we think, wow, what innovation. What’s to come in the next ten years is going to be breathtaking.

Awesome. Thank you!


Favorite Books & Media

A Conversation with Renaissance Technologies’ CEO Peter Brown

Peter Brown, the CEO of the famously secretive Renaissance Technologies, was recently interviewed by Goldman Sachs.  To my knowledge, this is the only time Peter has been interviewed in a forum like this.  In the discussion, he walks through a bit of the history and evolution of Renaissance and shares stories from different periods of market volatility.  Peter also shared that Renaissance has been working on what sounds like “generative AI” technology for well over a decade (in fact, his PhD focused on language translation models).

Founder’s Podcast: Working with Jeff Bezos

Founder’s Podcast, one of my favorite podcast series and one that I highly recommend, features a nice rundown of the key lessons learned from reading the book Working Backwards: Insights, Stories, and Secrets from Inside Amazon.  Authored by two longtime Amazon insiders, the book shares a number of amazing Bezos insights and anecdotes, including Bezos’ “write the press release first” approach to product development, Amazon’s no-PowerPoint policy (in favor of written memos), and how Bezos tried to accelerate business agility and speed by cutting down on internal corporate communications.  This is a great listen.

Art of Investing Podcast: Todd Combs – Investing, the Last Liberal Art

One of Warren Buffett’s investment managers and the CEO of GEICO, Todd Combs, has maintained a low profile since joining Berkshire Hathaway back in 2010.  This podcast dives deep into Todd’s career progression, starting off as an insurance regulator before managing much of Progressive Insurance’s data analytics and then starting a hedge fund (with the support of the Blue Ridge Capital eco-system).  Todd also shares some interesting stories about working with Charlie Munger and Buffett.  In short, it is easy to see what Buffett saw in Combs, and it is not hard to see the role he can play in Berkshire’s future as a potential manager of both the company’s investments and insurance operations.

Going Infinite: The Rise and Fall of a New Tycoon by Michael Lewis

I always enjoy Michael Lewis’ books, and this fast-paced story about the rise and fall of FTX and Sam Bankman-Fried (“SBF”) does not disappoint.  I thought Lewis did a good job of painting a picture of SBF’s brilliance, quirkiness, manipulative tendencies, and non-conforming approach to life and business.  He also provided a colorful look at some of the other characters in the FTX/SBF orbit.  However, in Act III of the book (the “Fall” of FTX), I thought Lewis was a bit naïve with some of his financial analysis, although I did appreciate his appropriately skeptical look at the (often self-dealing) way the US bankruptcy system and law firms work.  Worth a quick weekend read.


A Selection of Recent Tweets from @RagingVentures:


Fortuna Audaces Iuvat – Fortune Favors the Bold!