Show HN: Beatsync – perfect audio sync across multiple devices

439 points by freemanjiang 2 months ago

Hi HN! I made Beatsync, an open-source browser-based audio player that syncs audio with millisecond-level accuracy across many devices.

Try it live right now: https://www.beatsync.gg/

The idea is that with no additional hardware, you can turn any group of devices into a full surround sound system. MacBook speakers are particularly good.

Inspired by Network Time Protocol (NTP), I do clock synchronization over websockets and use the Web Audio API to keep audio latency under a few ms.

You can also drag devices around a virtual grid to simulate spatial audio — it changes the volume of each device depending on its distance to a virtual listening source!
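A minimal sketch of the distance-to-volume idea, assuming a simple inverse-distance falloff. The function name `gain_for_distance`, the `ref_distance` parameter, and the falloff law itself are illustrative assumptions, not Beatsync's actual code.

```python
import math

def gain_for_distance(device_xy, listener_xy, ref_distance=1.0):
    """Return a volume multiplier in (0, 1] for a device on the virtual grid.

    Assumed model: full volume inside ref_distance, then inverse-distance
    falloff (gain = ref_distance / distance). Hypothetical, for illustration.
    """
    distance = math.dist(device_xy, listener_xy)
    if distance <= ref_distance:
        return 1.0
    return ref_distance / distance

# Devices placed on a grid; the virtual listening source sits at the origin.
devices = {"laptop": (0.5, 0.0), "phone": (3.0, 4.0)}
gains = {name: gain_for_distance(pos, (0.0, 0.0)) for name, pos in devices.items()}
```

With these positions the laptop stays at full volume and the phone, five grid units away, drops to a fifth of it. Dragging a device on the grid would just mean recomputing its gain.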

I've been working on this project for the past couple of weeks. Would love to hear your thoughts and ideas!

whimsy 2 months ago

This is very, very cool; it's a thing I've been looking for on my backburner for several years. It's a very interesting problem.

There are a ton of directions I can think about you taking it in.

The household application: this one is already pretty directly applicable. Have a bunch of wireless speakers and you should be able to make it sound really good from anywhere, yes? You would probably want support for static configurations, and there's a good chance each client isn't going to be able to run the full suite, but the server can probably still figure out what to send to each client based on timing data.

Relatedly, it would be nice to have a sense of "facing" for the point on the virtual grid and adjust 5.1 channels accordingly, automatically (especially left/right). [Oh, maybe this is already implicit in the grid - "up" is "forward"?]

The party application: this would be a cool trick that would take a lot more work. What if each device could locate itself in actual space automatically and figure out its sync accordingly as it moved? This might not be possible purely with software - especially with just the browser's access to sensors related to high-accuracy location based on, for example, wi-fi sources. However, it would be utterly magical to be able to install an app, join a host, and let your phone join a mob of other phones as individual speakers in everyone's pockets at a party and have positional audio "just work." The "wow" factor would be off the charts.

On a related note, it could be interesting to add a "jukebox" front-end - some way for clients to submit and negotiate tracks for the play queue.

Another idea - account for copper and optical cabling. The latency issue isn't restricted to the clocks that you can see. Adjusting audio timing for long audio cable runs matters a lot in large areas (say, a stadium or performance hall) but it can still matter in house-sized settings, too, depending on how speakers are wired. For a laptop speaker, there's no practical offset between the clock's time and the time at which sound plays, but if the audio output is connected to a cable run, it would be nice - and probably not very hard - to add some static timing offset for the physical layer associated with a particular output (or even channel). It might even be worth it to be able to calculate it for the user. (This speaker is 300 feet away from its output through X meters of copper; figure out my additional latency offset for me.)

camtarn 2 months ago

> This speaker is 300 feet away from its output through X meters of copper; figure out my additional latency offset for me.
0.3 microseconds. The period of a wave at 20kHz (very roughly the highest pitch we can hear) is 50 microseconds. So - more or less insignificant.
Cable latency is basically never an issue for audio. Latency due to speed of sound in air is what you see techs at stadiums and performance halls tuning.
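To make the comparison concrete, here is a rough calculation for the same 300-foot run, once through cable and once through air. The velocity factor of 0.7 and the 343 m/s speed of sound are assumed ballpark values, and the function names are made up for this sketch.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
SPEED_OF_SOUND_M_S = 343.0   # assumed: dry air at roughly 20 °C
FEET_TO_METERS = 0.3048

def cable_delay_s(length_m, velocity_factor=0.7):
    """Signal propagation delay through a cable.

    velocity_factor is the fraction of c at which the signal travels;
    0.7 is an assumed ballpark, real cables vary.
    """
    return length_m / (SPEED_OF_LIGHT_M_S * velocity_factor)

def air_delay_s(distance_m):
    """Time for sound to cross the same distance through air."""
    return distance_m / SPEED_OF_SOUND_M_S

run_m = 300 * FEET_TO_METERS            # 91.44 m
cable_us = cable_delay_s(run_m) * 1e6   # microseconds
air_ms = air_delay_s(run_m) * 1e3       # milliseconds
```

The cable comes out well under a microsecond, while the same distance through air costs roughly a quarter of a second, which is why venue techs tune for air delay and ignore the cable.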
- whimsy 2 months ago
  
  Oh, thanks for correcting me! Now that you mention it, I'm confused by a memory I have. Wired speakers seem to be less common these days but I remember being told about two decades ago that the "proper" way to install speakers was to run out equal lengths of speaker cable (basically just jacketed copper, afaik) to different speakers even if they weren't equidistant in a room. (This was advice for home installation, not stadium-sized installations.)
  Do you suppose there exists some other reason for that, like maybe matching impedance on each cable, or is this likely one of those superstitions that audiophiles fall prey to?
  - rahimnathwani 2 months ago
    
    Superstition for sure.
- superjan 2 months ago
  
  For those wondering: The rule of thumb here is that light travels at one foot per nanosecond. 300 ns = 0.3 μs. Electricity is a bit slower but the same order of magnitude.
  - KayEss 2 months ago
    
    And by a happy coincidence it turns out that audio does about one foot in one millisecond making light six orders of magnitude faster
    
    entropie 2 months ago
    
    About one foot sounds very american.
    
    superjan 2 months ago
    
    I’m in europe so I am all in on the metric system. But “about a foot” per nanosecond is so easy to remember, understand and reason about that it is worth the exception. If you prefer something European, think of a sheet of A4 printer paper: the long side is 29.7 cm. “One length of A4 per nanosecond” is within 1% of the actual value of the speed of light.
    
    entropie 2 months ago
    
    > But “about a foot” per nanosecond is so easy to remember
    Well, but it's also just wrong. It's 12.5% too low. That's why "about one foot" sounded absolutely wrong to me.
    
    dspillett 2 months ago
    
    The original comment used imperial measures, following comments kept to that for consistency.
    To put things into proper units: speed of light in vacuum is approx 1.8 terafurlongs per fortnight, and electricity in wires has a pace of similar magnitude, and sound in normal atmospheric conditions shuffles along at approx 2.1 megafurlongs per fortnight.
freemanjiang 2 months ago

Thank you for the kind words! Yeah, I think it gets a lot more complicated once you start dealing with speaker hardware. It pretty much only works for the device's native speaker at the moment.
The instant you start having wireless speakers (eg. bluetooth) or any sort of significant delay between commanding playback and the actual sound coming out, the latency becomes audible.
- raisedbyninjas 2 months ago
  
  For devices with mics, can you have them play a test chirp to measure the latency of Bluetooth or other laggy sound stack?
  - hn8726 2 months ago
    
    Bluetooth audio devices that I use tend to change the protocol as soon as it switches to headset mode (with microphone enabled), which works terribly for music. I imagine the protocol used when the microphone is enabled might have completely different latency characteristics than the one used purely for audio, so a chirp might be measuring completely different thing
    
    sokka_h2otribe 2 months ago
    
    You could use a different device in the swarm for measurement, but yeah it seems pretty quickly complicated! I have no idea as well how stable the latency is
    
    apitman 2 months ago
    
    BTLE has built-in features for latency detection right?
- WhtWsThtAgn 2 months ago
  
  Awesome project!
  If you support mic input, you can allow the user to select a device as the "nexus" with mic recording on. Then you tell each device in your setup to "chirp" at the same exact time, but at different frequencies. Then you can derive the individual device's "local delay" and compensate.
  This allows you to tune the surround setup to full accuracy for a given point in space, and it will take care of ring buffer differences, wireless transfers of non-tethered speakers, etc.
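A rough sketch of how the "nexus" device could recover one speaker's delay from a recorded chirp, using brute-force cross-correlation. Everything here is an assumption for illustration: the sample rate, the sweep range, the helper names, and the simulated recording. With several devices, each would sweep its own frequency band so the correlations stay separable.

```python
import math

def make_chirp(n_samples, sample_rate, f_start, f_end):
    """Linear sine sweep from f_start to f_end Hz."""
    duration = n_samples / sample_rate
    sweep_rate = (f_end - f_start) / duration
    return [
        math.sin(2 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t))
        for t in (i / sample_rate for i in range(n_samples))
    ]

def estimate_delay_samples(recording, chirp):
    """Return the lag at which the chirp best matches the recording
    (brute-force cross-correlation; fine for short test signals)."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(recording) - len(chirp) + 1):
        score = sum(c * recording[lag + i] for i, c in enumerate(chirp))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

SAMPLE_RATE = 8000
chirp = make_chirp(160, SAMPLE_RATE, 500.0, 2000.0)

# Simulated microphone capture: silence, then the chirp arriving 240 samples late.
true_delay = 240
recording = [0.0] * 800
for i, sample in enumerate(chirp):
    recording[true_delay + i] += 0.5 * sample

delay_samples = estimate_delay_samples(recording, chirp)
delay_ms = 1000 * delay_samples / SAMPLE_RATE
```

The measured lag bundles together output latency, travel time through the air, and the microphone's own input latency, so it is only meaningful relative to the other devices measured by the same mic.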
hgomersall 2 months ago

Silent disco in which everyone brings their own source and headphones.
- cypherpunks01 2 months ago
  
  Absolutely! Silent disco still requires impractically expensive rental hardware to work well as far as I know. A lot of them run off FM radio, since it's the simplest way to go, but nobody owns portable radios anymore.
  An OSS app with the ability to sync everyone up over mobile or wifi, on Android or iOS with BYO headphones, would be incredible. This should be a thing :)
  - vladvasiliu 2 months ago
    
    I wonder if something like this (without the OSS part) doesn't already exist. Some cinemas in France have some kind of app for people who are either hearing or visually impaired which allows them to follow the movie.
    I've never seen it in action and don't know how it works, but at least for the audio part it should be able to synchronize the phone with the cinema screen.
    If I'm not mistaken, it's provided by this company: https://www.twavox.com/en/
    
    sokka_h2otribe 2 months ago
    
    Roku sticks allow this for t.v watching, via the Roku app. No idea how well it works for audio or more latency sensitive applications.
  - nsteel 2 months ago
    
    Snapcast has a webapp and a native android client. Although I'm not sure how well it handles many, many clients. In theory, if all on the same WiFi they should all play in sync like a silent disco (at least for those not using Bluetooth headphones where the playback latency is too high/not available).
    
    pmontra 2 months ago
    
    Web radios handle many clients. The first problem could be if the Wi-Fi hot spot can handle that many clients. The second one is that web radios and their protocols usually don't care if two clients are not in sync. They are usually in different places, maybe different continents.
    I'm self hosting a web radio for my LAN at home. I set it up years ago, I'm not there so I can't check the details but I think it is: Icecast2 on an ARM small server with DeeFuzzer (sp?) to send my mp3s to the Icecast2 server. MPV or VLC to play music on my Linux laptop and Transistor from F-Droid (I believe)
    
    nsteel 2 months ago
    
    They handle many clients but they absolutely do not play in sync. That's never a requirement for them and I'm not aware of any web radio protocol supporting that feature. Web radio is not the right solution for a silent disco type situation where you can at least guarantee everyone is relatively local.
  - jpc0 2 months ago
    
    I think wifi + WebRTC would be too hard for this, honestly. Maybe add wifi6 as a requirement, because it can theoretically have lower latency.
- pmontra 2 months ago
  
  "Their own source" looks like they are bringing their own files or (more probably) their Spotify or YouTube. It happens all the time on public transport. Or did you mean bringing their own music and taking turns at sharing it with the other people around? That might be against the terms of service of some services.
  - timdiggerm 2 months ago
    
    Surely, since "silent disco" only really works if everyone is dancing to the same music (which is the only thing that would make sense for a post about synchronizing audio), they're using "source" to mean "device"
- pcthrowaway 2 months ago
  
  I believe the syncing won't work when playing with a bluetooth device

freemanjiang 2 months ago

I primarily built this for group in-person listening, and that's what the spatial audio controls are for. But what is interesting is that since it only requires the browser, it works across the internet as well. You can guarantee that you and someone else are listening to the same thing even across an ocean.

Someone brought up the idea of an internet radio, which I thought was cool. Imagine if you could see a list of all the rooms people are in and tune in to exactly what they're jamming to.

Ne02ptzero 2 months ago

> You can guarantee that you and someone else are listening to the same thing even across an ocean.
How can you guarantee that? NTP fails to guarantee that all clocks are synced inside a datacenter, let alone across an ocean (Did not read the code yet)
EDIT: The wording got me. "Guarantee" & "Perfect" in the post title, and "Millisecond-accurate synchronization" in the README. Cool project!
- moomin 2 months ago
  
  More, the speed of light puts a hard cap on how simultaneous you can be. Wolfram Alpha reckons New York to London is 19ms in a vacuum, more using fibre.
  Going off on a tangent: Back in the days of Live Aid, they tried doing a transatlantic duet. Turns out it’s literally physically impossible because if A sings when they hear B, then B hears A at least 38ms too late, which is too much for the human body to handle and still make music.
  - recursive 2 months ago
    
    It's a less hard problem than the duet. If the round-trip is 38ms, you can estimate that the one-way latency is 19ms. You tell the other client to play the audio now, and you schedule it for 19ms in the future.
    That's assuming standard OS and hardware and drivers can manage latency with that degree of precision, which I have serious doubts about.
    In a duet, your partner needs to hear you now and you need to hear them now. With pre-recorded audio, you can buffer into the future.
    
    moomin 2 months ago
    
    You’re right that it’s an easier problem, but it’s still trickier than it looks. Remember the point of this is to be listening together. To do that, you need to be able to communicate your reactions. And then you’re back to the 38ms (in practice it’s probably twice that). Either way, at 120bpm that’s over a bar!
    If you _don’t_ have real time communication, then you don’t really need to solve this problem. But the problem is fundamentally unsolvable because the speed of light (in a vacuum) is the speed of causality and, as I say, puts a hard cap on simultaneity. This tends to be regarded as obvious at interstellar distances but it affects us at transatlantic distances too.
    
    recursive 2 months ago
    
    You're basically right, but one 4-beat bar @120bpm is 2000ms.
    Also latency demands on conversation are not nearly as tight as those on music performance. See ubiquitous video conferences.
    
    moomin 2 months ago
    
    My brain ain't working, and yeah, I don't tend to notice transatlantic delays on voice and video calls.
  - amluto 2 months ago
    
    > More, the speed of light puts a hard cap on how simultaneous you can be.
    Special relativity does indeed have something to say about simultaneity.
    > Wolfram Alpha reckons New York to London is 19ms in a vacuum, more using fibre.
    And this is not, in any respect, a limit on simultaneity. If the endpoints are moving very, very quickly relative to each other, then there are complications. Otherwise you measure that 19ms or so and deal with it.
- freemanjiang 2 months ago
  
  Haha yeah guarantee is a strong word. I just mean that it’s good enough to not be noticeable (even within the same physical room)

daredoes 2 months ago

Have you seen snapcast? That's currently my go-to audio sync solution for running whole house audio. Always open to alternatives, but so far nothing beats the performance and accessibility

freemanjiang 2 months ago

yes but only after posting! it's very cool—i'm actually a little embarrassed to not have seen it before.
they're doing a smarter thing by doing streaming. i don't do any streaming right now.
the upside is that beatsync works in the browser. just a link means no setup is required.

thruflo 2 months ago

This looks really cool, congrats!

Just to share a couple of similar/related projects in case useful for reference:

http://strobe.audio multi-room audio in Elixir

https://www.panaudia.com multi-user spatial audio mixing in Rust

Krei-se 2 months ago

If you do RTP with pulseaudio you can know the latency of all devices and have it synced by design - no extra software needed, device agnostic. If it somehow runs linux it will "just work".

fao_ 2 months ago

Works with pipewire too, although the user-facing docs are pretty sparse
- Krei-se 2 months ago
  Here are my server and client configs, in case someone comes across this from Google. It sets up sinks and sources, so you can just mute it, but it would just play automatically from logon:
  Needed on Server and Clients is an override to a) fix my domain users having the same cookie if it's stored in the default location and b) make sure the server only starts when the network is REALLY up - the normal network online is a system service only and thus you cannot check for it in a user's service. In my case the server runs under a domain user's profile.
  ~/.config/systemd/user/pipewire-pulse.service.d/override.conf
  [Unit]
  After=user-network-wait.service avahi-daemon.service
  [Service]
  # this changes the location of the cookie because i use roaming homes for domain clients and each machine would have the same cookie
  ExecStartPre=/bin/bash -c 'systemctl --user set-environment PULSE_COOKIE=/run/user/$(id -u)/pulse/cookie'
  ~/.config/systemd/user/user-network-wait.service
  [Unit]
  Description=Wait for Network Connectivity
  [Service]
  Type=oneshot
  # This pings your LAN router and creates a network-online file in /run to pick up
  ExecStart=/bin/bash -c '[ -f /run/user/$(id -u)/network-online ] || (until ping -c1 10.126.0.1 >/dev/null 2>&1; do sleep 1; done; touch /run/user/$(id -u)/network-online)'
  [Install]
  WantedBy=default.target
  Server Pulseaudio:
  Not needed but very useful:
  /etc/pipewire/pipewire-pulse.conf.d/50-networkparty.conf
  context.exec = [ { path = "pactl" args = "load-module module-native-protocol-tcp auth-anonymous=yes listen=10.126.1.1 auth-ip-acl=127.0.0.1;10.126.0.0/16" } ]
  # needed. Note how to make sure s16le is used across all devices to keep conversion to a minimum and how to name the sink somewhat sanely
  /etc/pipewire/pipewire-pulse.conf.d/70-rtp-sender-sink.conf
  context.exec = [ { path = "pactl" args = "load-module module-null-sink sink_name=rtp_sender_sink format=s16le channels=2 rate=48000 sink_properties='device.description=\"RTP Sender Sink\"'" }
  ]
  /etc/pipewire/pipewire-pulse.conf.d/71-rtp-sender-23912611.conf
  context.exec = [ { path = "pactl" args = "load-module module-rtp-send source=rtp_sender_sink.monitor source_ip=10.126.1.1 destination_ip=239.126.1.1 port=5004 inhibit_auto_suspend=always" } ]
  You can play to the sink f.e. in mpd with:
  audio_output { type "pulse" name "RTP Sender Sink Pulse" sink "rtp_sender_sink" }
  Client Pulseaudio:
  /etc/pipewire/pipewire-pulse.conf.d/71-rtp-receiver.conf
  context.exec = [ { path = "pactl" args = "load-module module-rtp-recv sink=combine_sink sap_address=239.126.1.1 latency_msec=64.3750" } ]
  You can play with the latency_msec; journalctl will tell you the lowest fragment if you just put 0 or 1ms here. It needs to be a multiple of that minimum, just experiment. I'm fine with this value even though 12ms would also work in my LAN, because it's more stable across the wifi bridge.
  The sap_address on the client may work to select the right multicast address even though its actually for the SAP announcements but don't count on that, i have not tested multiple streams so far and would not use "magic" solutions like SAP on the server (and they didn't work in my case and seem pipewire-only). Right now the client seems to pick the right stream - experiment ;)
  The sink in my case is a module-combo, just check with pactl list sinks which sink you want the stream to play on. Note that this is not some application you can dynamically assign to other sinks!!
  For LAN, if you run openwrt just enable igmp_snooping and multicast_querier on the softwarebridge (Luci --> Network --> Interfaces --> Tab Devices) and maybe Multi to Uni in your wifi advanced settings. I dont use this though as my wifi is another vlan or WDS-bridged so i stay out of these problems mostly.
  There are more advanced settings possible with openwrt, including having working igmp_snooping on the hardware switch, if you are interested frequent my documentation (german) on Krei.se as i will write a guide for this sometime lol (or just ask me by DM). Its possible to run this ms-exact with clean network in any case, there is no need to install extra software or clog unused ports with multicast-traffic. If you are perfect about this the music will flow like water through your LAN only where its needed.
worthless-trash 2 months ago

Careful i think the apple marketing department believes they have the exclusive use of the "just work" moniker.
- Krei-se 2 months ago
  
  Haha, well you have to set up multicast in your LAN like a good plumber, then fight context.exec and context.modules - also "just works" means in my case setting up a systemd After=network-watchdog in pulseaudios override that checks whether network is up in userspace.
  But Poettering Software (systemd/pulseaudio) is quite composable, so even though there is a learning curve, the alternatives are 20k config file monoliths.
  Still this even turns rooted LG TVs and cheap raspi picos into sinks.
  The only latency i now have is the bass traveling slower than the trebles lol

Dwedit 2 months ago

How does it deal with the audio ring buffers on the various devices? Does it just try to start them all at the same time, or does it take into account the sample position within the buffer?

freemanjiang 2 months ago

Great question! There's two steps:
First, I do clock synchronization with a central server so that all clients can agree on a time reference.
Then, instead of directly manipulating the hardware audio ring buffers (which browsers don't allow), I use the Web Audio API's scheduling system to play audio in the future at a specific start time, on all devices.
So a central server relays messages from clients, telling them when to start and which sample position in the buffer to start from.
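The two steps above can be sketched as follows. This is the textbook NTP offset formula applied to four timestamps, plus a conversion from an agreed server-clock start time to the local clock. The function names and the example timestamps are assumptions for illustration; the source does not show Beatsync's actual message format or code.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Classic NTP estimate from four timestamps (all in seconds).

    t0: client send time (client clock)
    t1: server receive time (server clock)
    t2: server send time (server clock)
    t3: client receive time (client clock)
    Returns (offset, round_trip_delay); offset is server clock minus
    client clock, assuming the network path is symmetric.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay

def local_start_time(server_start, offset):
    """Convert an agreed server-clock start time to this client's clock."""
    return server_start - offset

# Example: client clock is 0.250 s behind the server, 20 ms each way.
offset, delay = ntp_offset_and_delay(t0=10.000, t1=10.270, t2=10.271, t3=10.041)
start_local = local_start_time(server_start=12.000, offset=offset)
```

In a browser, the resulting local time would still need to be mapped onto the `AudioContext` clock before being passed as the `when` argument of `AudioBufferSourceNode.start()`, with the sample position supplied as the `offset` argument in seconds.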
- camtarn 2 months ago
  
  Interesting. Feels like this might still have some noticeable tens-of-milliseconds latency on Windows, where the default audio drivers still have high latency. The browser may intend to play the sound at time t, but when it calls Windows's API to play the sound I'm guessing it doesn't apply a negative time offset?
- serial_dev 2 months ago
  
  So it doesn't need to use the microphone? I guess from the "works across the ocean" comment and based on this description. I would have thought you would listen to the mic and sync based on surrounding audio somehow but it's good to know that it's not needed.
  - freemanjiang 2 months ago
    
    Yup no microphone. It's all clock sync
cosmotic 2 months ago

Another issue is seeking in compressed audio. When seeking (to sync), some APIs snap to frame boundaries.
- cosmotic 2 months ago
  
  I solved this by decompressing the whole file into memory as PCM.
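A small worked example of why frame snapping matters, assuming MPEG-1 Layer III audio (1152 samples per frame) and a decoder that snaps down to the previous frame boundary. The snapping direction and the function names are assumptions; real APIs differ.

```python
SAMPLES_PER_MP3_FRAME = 1152  # MPEG-1 Layer III

def pcm_sample_index(seek_seconds, sample_rate):
    """Exact sample position in fully decoded PCM."""
    return round(seek_seconds * sample_rate)

def frame_snapped_index(seek_seconds, sample_rate, frame_size=SAMPLES_PER_MP3_FRAME):
    """Sample position if the decoder can only start on a frame boundary
    (assumed here to snap down to the previous frame)."""
    exact = pcm_sample_index(seek_seconds, sample_rate)
    return (exact // frame_size) * frame_size

exact = pcm_sample_index(1.0, 44_100)
snapped = frame_snapped_index(1.0, 44_100)
error_ms = 1000 * (exact - snapped) / 44_100
```

Seeking to exactly one second lands on sample 44100 in PCM but on sample 43776 at the nearest earlier frame boundary, an error of about 7.3 ms, which is far more than the sync budget discussed in this thread.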
brcmthrowaway 2 months ago

This is my question, does it do interpolation or pitch bending

TowerTall 2 months ago

Could have used this 25 years ago when I was working in a large room with ~100 other people. Every friday an mp3 was distributed and then at the same time we all started playing it signaling that the workday has ended and the friday bar was open. Fun times.

Groxx 2 months ago

Impressively accurate - Android phone in Firefox <-> Chrome on OSX == basically perfect to my ear. That's super cool, thanks for sharing!

cypherpunks01 2 months ago

For fun I tried syncing over Tor as well. It works impressively well! Amazingly tight sync considering the latency is 3 random hops around the world.

maxmynter95 2 months ago

It's a really interesting vibe when you play on multiple machines. Sometimes you can notice a slight off-ness which gives this reverb effect.

rezonant 2 months ago

It's not open source until you pick a license. Since there is no license in this repository, it is at best source-available.

freemanjiang 2 months ago

Thanks for the heads up! Just added a license to the repo.

hackncheese 2 months ago

Any plans to integrate this with Apple Music or Spotify? I would assume your algorithm would work only with files uploaded to the site, but curious if you had plans to attempt something with Apple Music/Spotify

freemanjiang 2 months ago

Yes! The very next step.
- alexweej_ 2 months ago
  
  This is kind of where my attempt at this idea during lockdown died... Copyright law
- iamsaitam 2 months ago
  
  The challenge here is not technical but legal, good luck.

dchristian 2 months ago

Watch out for patent problems. There was a major dust up between Sonos and Google over audio sync technology. Disclaimer: I've worked for both companies, but not on that

lacoolj 2 months ago

Very very cool idea, but this is a bummer: "Optimized for Chrome on macOS. Unstable for other platforms..."

Once that changes (at the very least, the macOS part), I can't wait to play with it!

freemanjiang 2 months ago

It works on other platforms! Just not as smooth as Chrome.
- freemanjiang 2 months ago
  
  Made an update so it should be good on most!
  - lacoolj 2 months ago
    
    Yes! I am playing with it on my two phones and PC and it's absolutely wonderful
    Not a huge deal, but if I start playing from my Pixel 4, it staggers the startup between devices (either pixel is faster or vice versa).
    If this starts working with spotify somehow, I'll be set for my weekends coding around the house!

matteason 2 months ago

This is really cool, thanks for sharing. I've got a couple of pseudo-radio stations on https://ambiph.one/ which are very roughly synchronised for all users but it's based on their device clock so it can get a couple of seconds out of sync very easily. Looking forward to picking through your code to see if there are any techniques I can borrow!

jauntywundrkind 2 months ago

Unfortunately the w3c webtiming community group has closed. It'd be amazing to have the browser better able to keep time in sync across devices.

https://www.w3.org/community/webtiming/

https://github.com/webtiming/timingobject

duped 2 months ago

Luckily the audio industry has solved this problem, and they use PTP as the clocking mechanism for AES67 (kind of the bastard child of Ravenna and Dante, but with a fully open* AoIP protocol) that's designed for handling all the hard parts of sync'ing audio over a network. And it's used everywhere these days, but mostly in venues/stadiums/theme parks.
* open if you pay membership dues to the AES or buy the spec
- jauntywundrkind 2 months ago
  
  Hopefully wifi8 has something PTP built-in. I hear there's some vague hope that better timing info is one of the core pieces, so maybe maybe!
  I'm super jazzed seeing AES67 emerge.. although it not working great over wifi for lack of proper timing info hurts. Very understandable for professional gear, but there's nothing I love more than seeing professional, prosumer and consumer gear blend together!
  PipeWire already has pretty decent support! There's a tracker where people report their experiences trying it with their hardware. Some really really interesting hardware shows up here (and elsewhere on the gitlab): https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/32...
- doughecka 2 months ago
  
  Well, AES67 isn't a great secret... It's just RTP with PTPv2, with some predefined codec, sample rate, and fpp options.

RicoElectrico 2 months ago

Does this resync periodically? (I mean not only when a new track starts)

freemanjiang 2 months ago

It doesn't at the moment, but I think it probably should. There's a non-trivial amount of clock drift that can happen over long periods of time.
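A back-of-the-envelope sketch of how quickly drift could eat the sync budget. The 50 ppm figure is an assumed order of magnitude for the relative rate error between two consumer device clocks, not a measurement from Beatsync, and the function names are made up.

```python
def drift_after(seconds, drift_ppm):
    """Accumulated offset in seconds between two clocks whose rates
    differ by drift_ppm parts per million."""
    return seconds * drift_ppm * 1e-6

def resync_interval_s(max_error_s, drift_ppm):
    """Longest time between resyncs that keeps drift under max_error_s."""
    return max_error_s / (drift_ppm * 1e-6)

# Assumed figure: 50 ppm relative drift between two devices.
drift_over_song = drift_after(4 * 60, 50)   # over a 4-minute track
interval = resync_interval_s(0.001, 50)     # to stay within 1 ms
```

Under that assumption a four-minute track drifts by about 12 ms, and staying within 1 ms would need a resync roughly every 20 seconds.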

bjackman 2 months ago

Very cool! As someone who doesn't know much about the topic, I'm surprised that "millisecond-level accuracy" is enough. I would have imagined that you need to be accurate down to some fairly small multiple of the sample period to avoid phasing effects.

Do you have any interesting insight into that question?

cesaref 2 months ago

If you look at professional distributed audio systems (Dante, AES67 etc) you'll find that they all require PTP support on the hardware to achieve the required timing accuracy, so yes, you need <1ms to get to the point of being considered suitable if you are doing anything which involves, say, mixing multiple streams together and avoiding phasing type effects.
However, it very much depends on what your expectations are, and how critical your listening is. If no one is looking for problems, it can be made to work well enough.
freemanjiang 2 months ago

Yeah the threshold is pretty brutal, but it is enough. Experimentally, I'd say you need under 2-3ms but even at 1ms you can start to hear some phase differences.
Most of the time, I think my synchronization algorithm is actually sub-1ms, but it can be worse depending on unstable network conditions.
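One way to see why even 1 ms is audible: two equal-level copies of the same signal, offset by a delay, cancel at frequencies where the delay equals half a period (a comb filter). The sketch below assumes equal amplitude at the listener and ignores room reflections; the function names are illustrative.

```python
def first_null_hz(delay_s):
    """Lowest frequency cancelled when a signal is summed with a copy
    of itself delayed by delay_s."""
    return 1.0 / (2.0 * delay_s)

def phase_shift_degrees(frequency_hz, delay_s):
    """Phase offset between the two copies at a given frequency."""
    return (360.0 * frequency_hz * delay_s) % 360.0

null_1ms = first_null_hz(0.001)
null_3ms = first_null_hz(0.003)
shift = phase_shift_degrees(500.0, 0.001)
```

A 1 ms offset puts the first cancellation at 500 Hz, squarely in the midrange, and a 3 ms offset moves it down to about 167 Hz, where the two copies are 180 degrees out of phase.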
- mkishi 2 months ago
  
  How are you measuring this? I'm surprised the Web Audio API scheduling system has that much insight into the hardware latency.
  - urbandw311er 2 months ago
    
    I was wondering that too. It’s an impressive demo when used on devices with low latency audio drivers but I’m not convinced there’s any ability to detect drift beyond this. Might be interesting to have an option to use microphones to detect and calibrate this… but then you have the same issue of an unknown delay on the microphone input too.
hatthew 2 months ago

Sound travels at a speed of ~1 foot/millisecond
- camtarn 2 months ago
  
  Oh, that's a nice approximation! Similar to Grace Hopper's famous demo of a roughly foot-long (11.8 inch) wire being about how far electrical signals travel in a nanosecond.
- bjackman 2 months ago
  
  Wow it's insane how slow that is!
  But also, I don't really have an intuition for why the speed of travel is relevant here?
  It's funny that I have a natural intuition that sound is slow over long distances, yet 1 ft/ms still feels astonishingly slow. And yet while I know light travels 1ft/ns, it's still astonishing that it takes 30ms to travel from London to Sydney at that speed.
  - hatthew 2 months ago
    
    sorry for late response
    What I mean is that if you have multiple speakers around you, if any one is 1 foot closer or further than another, it'll be off by a millisecond already. Given that most people probably aren't going to have sub-foot positioning accuracy, sub-millisecond timing accuracy isn't critically important.

chaosprint 2 months ago

have you considered using webtransport?

When I was developing Glicol (https://glicol.org/) sync, the main challenge was network jitter. Had to give it up eventually.

Furthermore, have you factored in the synchronization as perceived by the listener?

Also, it seems system-level differences, particularly in audio output latency across various OS and hardware setups, would need to be considered.

What I mean is, the variation in inherent audio output latency between different systems (e.g., Mac vs. Windows, different hardware) could easily exceed 10ms in itself.

freemanjiang 2 months ago

Practically, the network jitter is averaged out in the clock synchronization calculations, and even output latency is remarkably well-behaved. Have you tried it on different devices? It is only noticeable when there's an external device connected to the computer.
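One common way to keep jitter out of a clock offset estimate is to send many probes and trust only the ones with the shortest round trip. The sketch below uses that min-RTT filter with a median; it is an assumed technique for illustration, since the comment only says jitter is averaged out. The probe data and the name `robust_offset` are made up.

```python
from statistics import median

def robust_offset(samples, keep=5):
    """Estimate clock offset from many (round_trip_s, offset_s) probes.

    Probes with the smallest round trip were least affected by queuing,
    so keep only those and take the median of their offsets.
    """
    best = sorted(samples, key=lambda s: s[0])[:keep]
    return median(offset for _, offset in best)

# Simulated probes: true offset 0.250 s; slow round trips carry skewed offsets.
probes = [
    (0.040, 0.2500), (0.041, 0.2504), (0.043, 0.2497),
    (0.120, 0.2890), (0.042, 0.2501), (0.300, 0.1300),
    (0.044, 0.2499), (0.095, 0.2710),
]
estimate = robust_offset(probes)
```

The three slow probes, whose offsets are off by tens of milliseconds, never reach the median, so the estimate stays at 0.250 s.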

joelkoen 2 months ago

This is the most impressive demo I've ever seen - no app download, no account sign up, no crap, just works instantly. Well done.

freemanjiang 2 months ago

Thank you! Very appreciated kind stranger :)

pete1302 2 months ago

Great solution based on websockets!

My Vision: A web-based VLC-type player (capable of VLC-level features) with support for distributing audio channels over connected devices.

Hear me out: - Mac as display (movie screen) - iPad as a Center channel - 4 iPhones as LR and rear channels (and something for LFE).

Is it practical? Sounds cool in my head. What do you guys think??

dsr_ 2 months ago

It is not practical unless you already have all the devices.
Let's suppose that you are paying half the new price for the bottom-tier of each of these:
MBA: 500
iPad mini: 250
iPhone 16e: 300 * 4
Your budget is $1950. For that, you can get:
A 50" 4K TV, a Denon X1700H 7.2 receiver, 6 Klipsch R51M speakers, and a half-decent subwoofer, all new.
This will provide a far superior experience for you and a half-dozen friends, and each part will last longer, have no permanent battery to wear out, and be upgradable independently and without relying on a specific software product. I would estimate the lifetime of your proposal at about 3 years, and of my counter-proposal at 20 years.
Yours is much more portable.
- pete1302 2 months ago
  
  I was eagerly expecting such a practical breakdown of the vision. Thanks for the numbers.
  But the scenario is an open-source solution for a student living in a hostel or at home.
  The vision is tilted towards hostel dorms, where you won't bother with a home theater but where every friend has an Apple device (or any robust device with good onboard audio).
  My friend himself has a MacBook Pro and 2 iPhones, totalling 6 Apple devices (audio sinks).

Aldipower 2 months ago

This is very very cool! Love it, interface, demo, no need to download anything. Impressive.

Are you already doing latency compensation? You could measure the latency if one host becomes the master, and then compensate for it by delaying the master's playback a little bit.

amluto 2 months ago

Can this deal with latency from browser to actual output device?

(As an egregious example, AirPlay 2 has excellent audio sync but latency that is a good fraction of a second or even worse. A browser might be playing through AirPlay.)

radley 2 months ago

Cool, keep it up!

For anyone who's curious, Airfoil (a paid app) can play simultaneously from a Mac to a variety of devices:

https://rogueamoeba.com/airfoil/mac/

withinboredom 2 months ago

Works pretty well. There is noticeable desync when one device is plugged into speakers, though (at least on Windows). It'd be great if you had some way to get a specific device's audio lag, or at least let us edit that aspect.

freemanjiang 2 months ago

Yeah, at one point I had manual controls to adjust the delay to debug. I'll add that back.
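
A rough sketch of how a manual per-device offset could combine with the latency the browser reports. It assumes the Web Audio API's `AudioContext.baseLatency` and `AudioContext.outputLatency` properties (both in seconds; `outputLatency` is not available in every browser). The names `LatencyInfo` and `compensatedStartTime` are hypothetical and not taken from Beatsync.

```typescript
// Structural subset of AudioContext, so the function can also run outside a browser.
interface LatencyInfo {
  currentTime: number;    // context clock, in seconds
  baseLatency?: number;   // processing latency reported by the browser, in seconds
  outputLatency?: number; // estimated output-device latency, in seconds (may be missing)
}

// Context time at which to start a source so that sound leaves the speaker
// `delaySec` from now, after subtracting reported latency and a user-set offset.
function compensatedStartTime(
  ctx: LatencyInfo,
  delaySec: number,
  manualOffsetSec = 0,
): number {
  const reported = (ctx.baseLatency ?? 0) + (ctx.outputLatency ?? 0);
  const start = ctx.currentTime + delaySec - reported - manualOffsetSec;
  // Never schedule in the past; starting immediately is the best fallback.
  return Math.max(start, ctx.currentTime);
}
```

In a browser, the result could be passed as the `when` argument of `AudioBufferSourceNode.start()`. External speakers often add delay the browser cannot see, which is where the manual offset would come in.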

emilfihlman 2 months ago

Your CSS is broken in that it doesn't take into account the URL/menu bar on phones.

Yes, it's a super annoying problem. You should change the CSS so that the URL bar is always visible, and add a separate full-screen button.

freemanjiang 2 months ago

Oh I see, yeah I was wondering why it looked like that since on my computer responsive view looked great. Will look into this, thank you.

dragonw 2 months ago

I wonder if the technique could be adapted to work without the WebSocket server, and instead have the devices coordinate the clock offsets using peer-to-peer WebRTC connections.

LordGrignard 2 months ago

Hello! The app looks very polished and I'm sure it has a lot of use cases for everyone else, but I wanted to ask whether it can be used to sync playlist progress of an offline library (it's FLAC, of course) across devices. It's something I haven't found a solution for at all, other than some Plex thing which is paid. If you're synchronizing with millisecond accuracy, it should work for simply keeping track of the playlist's shuffle order and the last played song (i.e. the position in the ordered playlist)?

fitsumbelay 2 months ago

This has been popping up in various feeds of mine since yesterday (Mon 4/28)

Although I know nothing about NTP or networking really, I appreciate the use of Boring Old Tech for making this awesome software.

h2zizzle 2 months ago

Glad to see that one scene from Rainbows End/the thing I had to tell every Best Buy customer was impossible FINALLY become a reality. P3 All-Out Attack time. Kudos.

slmkbh 2 months ago

I did something similar with PulseAudio about 15 years ago: I had an old ThinkPad running Debian, with multicast activated on my source. Worked surprisingly well!

alana314 2 months ago

This works insanely well. What is your intended use for this? Multi room playback? Listening to music with friends?

awongh 2 months ago

TIL what NTP is. Interesting to read about how the underlying algorithm works, and that they had pretty good accuracy 40 years ago.

_joel 2 months ago

Wait until you hear about PTP :)

_joel 2 months ago

Really cool, but I had issues when uploading mixes (MP3s about 90 mins long, 150MB). They never make it to the playlists.

xnickb 2 months ago

If it's behind nginx, check client_max_body_size. Set it to something big.
- _joel 2 months ago
  
  Yeah, could be. This was using their .gg hosted version; I'll try locally.
  - freemanjiang 2 months ago
    
    Thanks for bringing this up! Made an issue to look into this.

gkanai 2 months ago

Interesting idea.

Have you thought about integrating support for timecode? Dante support also might bring your software to professional venues.

gitroom 2 months ago

Pretty cool, just the fact it works instantly in the browser with nothing to download is actually kinda wild to me tbh.

HelloUsername 2 months ago

Cool! I'd swap the 'search music' (cobalt.tools) button with the 'upload audio' button

crunchwrapjs 2 months ago

I've been wanting to make this for so long! It's crazy that it's done completely in the browser.

fxtentacle 2 months ago

How do you handle different playback devices having small clock speed differences between them?

kelvinzhang 2 months ago

The sync was so seamless I didn't even realize it was playing from my own device at first

freemanjiang 2 months ago

kelvin!

ajb 2 months ago

That's cool!

Last I heard, Safari was buggy and behind on Web Audio. Did you run into any issues there?

freemanjiang 2 months ago

Miraculously pulled it off with a change I made today

js4ever 2 months ago

Love it, this is impressive and very smart, no need for mic!

badmonster 2 months ago

How does it achieve millisecond-accurate multi-device audio synchronization across browsers?

krick 2 months ago

Very cool, but lacks volume controls.

dddw 2 months ago

Good demo!

djkesu 2 months ago

This is very cool.

johng 2 months ago

Very cool!

freemanjiang 2 months ago

thank you!

ng-henry 2 months ago

this looks so cool!

shahanneda 2 months ago

awesome!

imcritic 2 months ago

Nice idea, but this project is currently written in TypeScript, so I view it as a prototype at best.