mtlynch 8 hours ago

This is a fun idea, but I feel like Pong is too simple for the execution to work.

I watched the video, and it seemed like everything it was saying, you could have just pre-programmed for the very limited state space of Pong. It reminded me a little bit of the stock John Madden and Pat Summerall sound bites that would play during 90s / early 2000s Madden games.

Could you apply the same idea to chess or Texas Hold 'Em? I feel like the additional complexity of those games could lead to more interesting commentary.

  • pncnmnp 8 hours ago

    Author here. I agree with you - the number of metrics I can experiment with in Pong is limited. Chess and Go are next for me.

    Overall, the simplicity of this project has helped me test the waters before diving into more complex territories. The underlying pipeline isn't bad - the approach of collecting events, periodically generating metrics from them, prioritizing them, generating commentary text, queuing those outputs, and then synthesizing speech should serve as the core for similar work.
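
    A minimal sketch of that pipeline shape, not the project's actual code. The names `Event`, `Metric`, `compute_metrics`, and `render_line` are hypothetical, and the text and speech stages are stubbed out:

```python
import heapq
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Event:
    kind: str      # e.g. "bounce", "point" (hypothetical event names)
    player: str


@dataclass(order=True)
class Metric:
    priority: int                      # lower value = more newsworthy
    text: str = field(compare=False)   # raw fact handed to the text stage


def compute_metrics(events):
    """Aggregate a batch of events into prioritized metrics."""
    counts = Counter((e.kind, e.player) for e in events)
    metrics = []
    for (kind, player), n in counts.items():
        priority = 0 if kind == "point" else 1   # assumption: points outrank bounces
        metrics.append(Metric(priority, f"{player}: {n} x {kind}"))
    return metrics


def render_line(metric):
    """Stand-in for the LLM call that turns a metric into commentary."""
    return f"Commentator: {metric.text}!"


def run_pipeline(events, max_lines=2):
    """events -> metrics -> priority queue -> text -> speech queue."""
    heap = compute_metrics(events)
    heapq.heapify(heap)
    speech_queue = []
    while heap and len(speech_queue) < max_lines:
        speech_queue.append(render_line(heapq.heappop(heap)))
    return speech_queue   # a TTS stage would consume this queue


events = [Event("bounce", "Left"), Event("bounce", "Left"), Event("point", "Right")]
lines = run_pipeline(events)
```

    Capping the queue at `max_lines` is one simple way to keep low-priority metrics from piling up behind the speech stage.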

    It's also given me some intuition on how I can construct an "ecosystem" of data surrounding live action, to add a layer of realism to the narratives.

    • rybosome 3 hours ago

      This is a great premise, and that underlying pipeline you mention sounds like a generally useful system for live commentary with the appropriate abstractions.

      I’m curious to know more about how you retrieve from this ecosystem of data to add color. You mentioned nearest neighbor search, is that over game state? How is the data stored and queried?

      • pncnmnp 3 hours ago

        Absolutely! I can elaborate on that part.

        The code starts by simulating 15 tournament years (like from 2010 to 2024), with each year containing 4 grand slam tournaments - held in a knockout format. There are 64 players in the pool, all starting with an initial ELO score.

        These players compete in the tournaments, with outcomes predicted based on their ELO ratings. ELO is then updated after each match. We rank players solely based on their ELO. Once the simulation completes, it generates a wealth of data. For each game, details such as points scored, points allowed, fastest ball speed, number of aces, point-by-point results, and more are simulated.
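
        A minimal sketch of the simulation described above, not the project's actual code. The starting rating of 1500, the K-factor of 32, the random bracket seeding, and sampling each winner from the Elo expectation are all assumptions, as are the names `play_match` and `run_knockout`:

```python
import random


def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def play_match(ratings, a, b, rng, k=32):
    """Sample a winner from the Elo expectation, then update both ratings."""
    exp_a = expected_score(ratings[a], ratings[b])
    a_won = rng.random() < exp_a
    score_a = 1.0 if a_won else 0.0
    ratings[a] += k * (score_a - exp_a)
    ratings[b] += k * ((1.0 - score_a) - (1.0 - exp_a))
    return a if a_won else b


def run_knockout(ratings, rng):
    """Single-elimination bracket over all players; returns the champion."""
    players = list(ratings)
    rng.shuffle(players)
    while len(players) > 1:
        players = [play_match(ratings, players[i], players[i + 1], rng)
                   for i in range(0, len(players), 2)]
    return players[0]


rng = random.Random(0)
ratings = {f"P{i}": 1500.0 for i in range(64)}
champions = [run_knockout(ratings, rng)
             for _year in range(2010, 2025)   # 15 tournament years
             for _slam in range(4)]           # 4 grand slams per year
ranking = sorted(ratings, key=ratings.get, reverse=True)
```

        Per-match details such as aces or ball speed would be recorded inside `play_match`; they are left out here to keep the rating logic visible.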

        We can then cache and use this information for a ton of color commentary. For example, we can identify the GOATs of the game, highlight players who are performing exceptionally well, pinpoint underdogs, find matches similar to the one currently being played, etc.

        However, I am just scratching the surface. Imagine having a function that considers "age" alongside ELO. Then, you could simulate performance based on age as well - and show things like the younger generation overtaking older players, or veterans still competing despite being past their prime. With a fn like this, you could simulate matches that span the past 75-100 years, generating a ton of nice data to analyze.

        Data itself is not fun - you need nice metrics too - for fun correlations! See https://en.wikipedia.org/wiki/Baseball_statistics. The metrics don’t have to be perfect, after all, humans aren’t perfect. The key is engagement.

        To find similar games, I store and cache all historical matches in a KD-tree, then use a NN search to find similar games - that's quite fast!
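
        A sketch of that lookup, using a small hand-rolled KD-tree so it runs on the standard library alone. The feature vector (points scored, points allowed, aces) is an assumption, since the comment does not say which features are indexed, and the helper names are hypothetical:

```python
def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes; returns nested dicts."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], target) < sq_dist(best, target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Only search the far side if the splitting plane is closer than the best hit.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best


# Hypothetical per-match features: (points scored, points allowed, aces).
history = [(11, 9, 3), (11, 2, 7), (11, 10, 1), (11, 5, 4), (11, 8, 2)]
tree = build_kdtree(history)
similar = nearest(tree, (9, 8, 2))   # the match currently in progress
```

        In practice the features would probably need scaling so that no single axis dominates the distance.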

        Some commentary can also be dynamically generated at runtime - for example, locker-room whispers. It is important to provide GPT with a decent historical window to avoid generating contradictory info in such cases.

  • htrp 6 hours ago

    >Could you apply the same idea to chess or Texas Hold 'Em? I feel like the additional complexity of those games could lead to more interesting commentary.

    The additional complexity in something like hold 'em lends itself extremely well to LLM generated commentary.

    • vunderba 6 hours ago

      Agreed. It would also be a fun experiment to adjust the LLM temperature to tune how much the commentator speculates: even though the AI commentator has access to all hands, the future draws are still unknown, so the "imperfect information" aspect remains.

  • 93po 8 hours ago

    i think pong being so simple is why this is funny and interesting

    • DigiEggz 7 hours ago

      I agree with this idea. The reason I visited this is that the idea of commentating Pong is inherently amusing. It makes me realize I'd be down to watch competitive pong.

QRe 12 hours ago

Fun experiment. Main limitation I see is the delay between actions and commentary because of the whole script generation & TTS overhead. It seems like the commentary can quickly fall behind, especially in fast-paced sports.

  • haneul 6 hours ago

    Naw there are tricks you can use to pipeline these things so that apparent latency is under 500ms even with significant game state history awareness, and also to interrupt ongoing but freshly out of date commentary.

    I couldn’t get it under 250ms though (for rocket league), but the tech should be better now than in 2024.

  • pncnmnp 9 hours ago

    Author here. TTS and script generation can be a bit of an overhead for now, which is why I've worked with metric aggregates - 30+ bounces rather than exactly 33, for example. For this game, one might ideally want this overhead to be less than the time it takes for the ball to bounce from one paddle to another, which can be around 1–2 seconds. However, there may be another strategy to (maybe?) overcome this: start synthesizing numbers (ignoring the fractional part) using TTS and cache them for both commentators. Then, patch those audio clips together after the core part is synthesized. It should be doable, I think - I just haven't gotten to it yet. Note that matching the excitement and tempo of the core commentary with those numbers is key - otherwise, it will feel janky.
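
    A rough sketch of both ideas, the coarse buckets and the pre-synthesized number clips. The TTS call is replaced by a stub that returns a string, and the voice names and helper names are hypothetical:

```python
from functools import lru_cache


def bucket_label(count, step=10):
    """Round a count down to a coarse bucket, e.g. 33 -> '30+'."""
    return f"{(count // step) * step}+"


@lru_cache(maxsize=None)
def number_clip(voice, number):
    """Stand-in for a TTS call; a real one would return audio data."""
    return f"<{voice}:{number}>"


def prewarm(voices, upto):
    """Synthesize integer clips ahead of time for every commentator."""
    for voice in voices:
        for n in range(upto + 1):
            number_clip(voice, n)


def patch_line(voice, template_clips, value):
    """Splice a cached number clip into pre-synthesized core commentary.

    A None entry in template_clips marks where the number goes; the
    fractional part of value is dropped.
    """
    return [number_clip(voice, int(value)) if part is None else part
            for part in template_clips]


prewarm(["alice", "bob"], upto=50)
line = patch_line("alice",
                  ["<alice:That rally lasted>", None, "<alice:bounces>"],
                  33.7)
```

    The splice itself is the easy part; matching the tempo and excitement of the cached clips to the surrounding audio is not addressed here.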

blakeburch 7 hours ago

Really fun to see! I'd love to have something similar for esports, like League of Legends or Rocket League. So much of the commentary feels like filler with stats and statements about a player.

  • haneul 6 hours ago

    Have done an interactive commentator for rocket league that is also simultaneously your duo partner. Works quite well. This was in October 2024 so the tech is there and even better now.

  • vishalontheline 7 hours ago

    E-Sports needs more commentators from Latin America or the Middle East.

petercooper 12 hours ago

I want this for when I'm working.

"Here we see Peter copying and pasting in some generic quick sort algorithm from.. somewhere. Stack Overflow? ChatGPT? Who knows. And he goes for the compile without writing any tests! Let's see if it compiles first time. And it's a noooooo! Bad luck, let's see how he gets out of this pickle. (I told you he should have written some tests.)"

  • qwertox 11 hours ago

    It raaaaaaannnnnn, no exceptiooooonss!!! Can you believe it??? Can you believe it how he compiled that code? What a beauty, what a beautiful job he's done...

    I wonder if NotebookLM's podcast function could be used for this, to comment on code in the spirit of a Latin American soccer commentator. Having it comment on code is already pretty useful if you don't want to explain to others what you have been doing. It can do that pretty well for you.

    • A4ET8a8uTh0_v2 8 hours ago

      Or we could go with David Attenborough :P

      dolphin mistral output:

      "In the digital ecosystem, where binary code intertwines with human cognition, there exists an important ritual known as the Coding Review. This intricate dance is not dissimilar to how our ancestors gathered around a communal fire, sharing stories and experiences in order to pass on wisdom and understanding of their world.

      The coding review takes place in a carefully-crafted digital habitat - often referred to as a development team's workspace. Here, the code, akin to DNA that carries the blueprint for all life forms, is meticulously examined by a group of highly specialized creatures known as developers and quality assurance analysts."

  • Retr0id 9 hours ago

    Someone could totally make this as a vscode (etc.) extension

  • layer8 7 hours ago

    Or an AI doing the part of the pair programmer who doesn’t have the keyboard.

jart 12 hours ago

Do headline games like John Madden do this? That's a great use case for LLMs.

  • Fripplebubby 8 hours ago

    You might not know this if you don't actually play these games (Madden, 2K for NBA, MLB The Show), but the commentary is extremely high quality, sometimes comparable to the TV broadcast with riffs and tangents as well as describing the action. Over many years of producing these games they have continually refined the process. Of course, eventually you will hear repeating dialogue if you play the games enough, but I think the baseline quality is going to be _very_ hard to replicate with an LLM.

  • IshKebab 11 hours ago

    Yeah I was thinking the same. No more "They've really got to want to win this. This is a game of two halves. Etc."

    Though tbh I found it still pretty annoying. Maybe just the tone of voice though, and it's clearly not actually connected to what's happening in the game.

    I imagine the major sports game players are working on this.

  • jsheard 11 hours ago

    None that I can think of. The Finals has AI generated voiceovers for its announcers, but in that case the lines are pre-written and voice clips generated ahead-of-time so it just reeks of penny-pinching by cutting out real voice actors, rather than using the tech to do things that genuinely weren't possible before.

    https://www.youtube.com/watch?v=kZ87wiHps9s

neilv 12 hours ago

Is a lot of the generated commentary pure fabrication?

  • raffael_de 11 hours ago

    If there is a programmed connection to the physics for the in-game commentary then it should be here: https://github.com/pncnmnp/xpong/blob/main/main.py#L212

    https://github.com/pncnmnp/xpong/blob/main/main.py#L289:

      "- **Shot Angles:** Derive each shot's angle from the (vx, vy) vector:\n"
      "    • Steep angles (>45°) become daring corner lobs or sharp cross-courts.\n"
      "    • Moderate angles (15°-45°) look like graceful arcs that test court coverage.\n"
      "    • Shallow angles (<15°) play out as direct, flat drives down the line.\n"
    
    Didn't find where the ball's motion is communicated to the LLM.
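
    For reference, the angle buckets quoted from that prompt could be computed in code rather than left to the model. This is a sketch only, not something found in the repo; it assumes the angle is measured from the horizontal paddle-to-paddle axis, and the function names are made up:

```python
import math


def shot_angle_deg(vx, vy):
    """Angle of the ball's path relative to the horizontal axis, 0-90 degrees."""
    return math.degrees(math.atan2(abs(vy), abs(vx)))


def classify_shot(vx, vy):
    """Map an angle onto the three buckets quoted from the prompt."""
    angle = shot_angle_deg(vx, vy)
    if angle > 45:
        return "steep"
    if angle >= 15:
        return "moderate"
    return "shallow"
```

    Passing the resulting label to the LLM would tie the commentary to the actual (vx, vy) vector.
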
    • investa 11 hours ago

      "Real time"

      It does need some pointless anecdotes about past statistics, history of the game, training regimes, new managers and so on!

sim7c00 13 hours ago

this is so funny my god haha. the intro is a bit dry but when the game is on it's fire haha :'). what an exhilarating match xD

smus 7 hours ago

I wouldn't say you taught the ai anything so much as wired some API calls together

ayongpm 13 hours ago

Pretty cool. I can see how commentary could make even Pong more interesting. Maybe there’s room for a pro Pong competition, kind of like what Tetris has.

hvardhan878 6 hours ago

I wonder if you can clone the voice and tonality of Peter Drury and even make a game of Pong emotional.

pawelduda 11 hours ago

Looks like the idea of Morgan Freeman narrating life in real time is closer to reality than ever

  • netsharc 4 hours ago

    Hah, now I'm thinking of an AI that knows your personality and steers you to do things you've always been fearful or anxious about. "Steve sees a 10/10 girl, the kind he's always felt inadequate to talk to, but he's a different man now, and he's going to tell her 'Tickle your ass with a feather?' in 5... 4... 3... 2...".

    Hah, my next startup is an AI-Assist Pick-Up Artist. But that's the "Lamborghini-desiring Crypto-Bro" package that's 49.95 USD/month, the entry level feature would encourage you to go to the gym and eat your vegetables.

    AI voices talking to you... now the hallucinations are actually in your head!

danjl 6 hours ago

What's with the circular ball?

  • antonvs 5 hours ago

    It’s 2025. We’ve been able to render small circles for a while now!

    Seriously though, the entire graphics display is much more hi res than the original, and it’s not trying to emulate the original resolution. So one slightly more serious way to answer the question is, all the graphics are higher resolution, it’s just that you notice it more when it comes to the ball.

    • danjl 4 hours ago

      Ok, then where's the ray tracing? ;-)

isaacremuant 10 hours ago

The idea is very fun and kudos to the author, but the actual game commentary itself definitely feels lacking. My perception is that it throws in random filler that doesn't quite feel like apt commentary.

If you're jokingly imitating filler from bad commentary I understand but I think I'd like more play by play and less color, but of course pong has a limited amount of inputs to work with for that commentary.

One thing that could very well work for the latency issue some commenters post is to just send the events and receive commentary outside of the rendering and playback so that it, within some max delay, can look more immediate and in sync.
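
A minimal sketch of that decoupling, assuming a background thread and two queues. The `fake_generate` function stands in for the slow LLM and TTS round trip, and all names here are hypothetical:

```python
import queue
import threading


def commentary_worker(events, lines, generate):
    """Consume game events off the render thread and emit commentary lines."""
    while True:
        event = events.get()
        if event is None:          # sentinel: shut down
            break
        lines.put(generate(event))


def fake_generate(event):
    """Stand-in for the slow LLM + TTS round trip."""
    return f"What a {event}!"


events, lines = queue.Queue(), queue.Queue()
worker = threading.Thread(target=commentary_worker,
                          args=(events, lines, fake_generate), daemon=True)
worker.start()

spoken = []
for event in ["serve", "bounce", "point"]:   # one event per rendered frame
    events.put(event)              # hand-off that does not wait for commentary
    try:
        spoken.append(lines.get_nowait())    # play a line only if one is ready
    except queue.Empty:
        pass                       # keep rendering; never wait on commentary

events.put(None)
worker.join()
while not lines.empty():           # drain whatever finished after the last frame
    spoken.append(lines.get_nowait())
```

Enforcing the maximum delay would mean timestamping each event and discarding lines that arrive too late, which this sketch leaves out.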

Very fun idea. Hope to see it with more complex things with more inputs.

oulipo 8 hours ago

Perfect meta-commentary on why AI is 90% useless stuff. Main use-case of AI in the real-world is really not far from commenting a pong match, eg trying to painfully make something exciting out of something worthless and dull, but not succeeding

MontgomeryPy 10 hours ago

Tom Brady's job as a color commentator may be in jeopardy ;)

DonHopkins 11 hours ago

Commentator 1 (Greg “The Swatch Whisperer”): Welcome back, folks, to what can only be described as the pinnacle of human achievement: watching Disney Princess™ Pink paint dry. I haven’t been this excited since the 2002 Home Depot Black Friday Sale when I almost got my hands on a discontinued eggshell Martha Stewart Lavender.

Commentator 2 (Marsha “Two Coats” Hernandez): Greg, I still remember the way you wept in aisle 7. But let’s talk about today’s masterpiece—Disney Princess Pink, the shade officially inspired by the collective inner glow of Aurora, Cinderella, and, dare I say, Ariel's clam-bikini energy.

Greg: Absolutely, Marsha. And look at that glorious semi-damp sheen—like a freshly glazed donut at sunrise. It’s got a dreamy undertone of "your niece’s birthday party at 10 a.m. with a bouncy castle and too much Capri Sun."

Marsha: Oh-ho, what’s this? Is that… yes, I think the lower left quadrant is beginning to matte. Ladies and gentlemen, we may be witnessing the first signs of Stage 3: The Settling of the Pigment.

Greg (choked up): My god… I haven’t seen a transition like this since Elsa’s Let It Go phase. Remember that? How she emotionally dried her entire personality over a solo in under three minutes? Iconic.

Marsha: Speaking of queens, this paint owes everything to Belle’s bedroom in the lost “Live Laugh Library” deleted scene. That’s the shade they were going to use until someone spilled tea on the concept art. Literally. It was Chip. That kid is a menace.

Greg: I’m sorry but—hold on—this is huge. That patch near the window just tightened. We are witnessing micro-shrinkage. It’s subtle, it’s refined, it’s got the attitude of Mulan at a dim sum buffet. She came hungry, and this paint came to DRY.

Marsha: Greg, if this drying pace keeps up, we’re on track for a Suburban First-Timer Finish Time. I haven’t seen Disney Pink behave like this since the infamous 2017 "Frozen Themed Daycare Hallway Incident." They had to repaint in Tiana Teal—the shame.

Greg: And oh! There it is! That final middle patch—she’s going matte, folks. This wall is becoming a canvas of completion, a poetic stillness in a chaotic world. I feel like I just watched Cinderella get her slipper and a Roth IRA.

Marsha (tearfully): This… is why I do this job. For moments like this. For the shimmerless silence. For the slow, glorious commitment to finality.

Greg: And so we leave you, dear viewers, staring into a flat, fully-dry future. The room has changed… and so have we.

  • antonvs 5 hours ago

    > Swatch Whisperer

    According to Google, you’re only the second person in recorded human history to use these two words together.

  • aspenmayer 5 hours ago

    > Greg: And oh! There it is! That final middle patch—she’s going matte, folks. This wall is becoming a canvas of completion, a poetic stillness in a chaotic world. I feel like I just watched Cinderella get her slipper and a Roth IRA.

    > Marsha (tearfully): This… is why I do this job. For moments like this. For the shimmerless silence. For the slow, glorious commitment to finality.

    > Greg: And so we leave you, dear viewers, staring into a flat, fully-dry future. The room has changed… and so have we.

    I’m getting major Broomshakalaka vibes in the best possible way.

    https://www.youtube.com/watch?v=zt2uIhAvQZ8