We spent the past two days on the Stanford campus attending the Hot Chips conference. This show has a high level of geek cred, with a focus on the very technical sides of developing chips. It was once a venue for purely technical conversations. Like so much in our Fallen AI Age, Hot Chips is starting to attract less desirable finance types such as ourselves, but it is still filled with some heavy technology topics.
There will be a lot of good technical coverage of the event from Dr. Cutress, Serve the Home, and many others. Here we just want to briefly touch on what we thought of as the highlights.
Probably the most enjoyable talk came from Eric Quinnell of Tesla. His topic was the new transport protocol they are using to connect their GPUs and accelerators, which we readily admit is not of much interest beyond specialist circles. However, this is a man who knows how to give a talk – humor, insight, drama and props.
Probably the most interesting technical presentation came from Broadcom’s Manish Mehta, who talked about their direct optical interconnect attached to AI accelerators. Co-packaged optics is almost a holy grail for the networking semis segment, and Broadcom’s new system is packed with some really impressive technical advances.
The best feature of Hot Chips is the post-talk Q&A sessions with a totally open mic. This is an event where engineers get to ask any question they want, and they had a lot of questions for Broadcom. As impressive as their part is, it turns out it still has some wrinkles to iron out. To their credit, Broadcom was open about those problems and even showed their roadmap for getting this part into production in a few years.
That all being said, by far the most interesting talk came from OpenAI, and not interesting in the way they probably intended. The speaker was Trevor Cai, who seems to lead the design and scaling of that company’s training systems. He talked a lot about the growth of AI, boilerplate stuff from OpenAI with some debatable comparisons. But when he talked about the real challenges of scaling up AI systems, he delivered some incredible insight. Running tens of thousands of GPUs is really hard, and it is no small achievement that OpenAI has gotten as far as they have.
Then came the Q&A. At the outset Cai said he would not discuss Artificial General Intelligence (AGI). So of course, half the questions touched on that subject. OpenAI has become such a high-profile company that the audience contained a lot of people with non-technical questions – what are OpenAI’s finances? What are the utilization rates on your servers? What chips are you going to buy next year? When is GPT 5 coming? Tell us all your technical secrets. After a while, we could see Cai’s face begin to show signs of distress, or at least a questioning of how he ended up here. A polite suggestion for OpenAI’s exec team – provide some PR training for your senior staff, because the whole world is watching your every move.
After nearly 30 minutes of questions, Cai made an important comment. Someone asked if it is possible to predict the time needed to achieve AGI. We imagine his defenses must have been overrun by that point, because he actually gave an answer. He said, “We cannot predict the timing for AGI, because we are still not clear about defining what that means.” This is a great point; defining intelligence is a complex subject with no easy answers. However, the one person who probably knows the most about scaling large AI systems admitted that no one really knows when AGI is coming. In fairness, it could come tomorrow, but it may also never arrive.