lip-sync models at a glance

Welcome to sync. Our suite of lip-sync models is engineered to animate your videos with naturalistic lip movement, syncing audio to video effortlessly.

capabilities of our models

  • universal fit: models work with any spoken audio and any face, right away.
  • precision: they modify only the mouth region to match the audio, leaving the rest of the video untouched.

model selection guide

sync-1.6

our best model: everything good about sync-1.5, now with fluid, human-like mouth movements. When to switch to sync-1.5: (i) you want mouth movements that are more visibly prominent, or (ii) you see visual artifacts at scene boundaries.


sync-1.5

stable and tested: accurate, high-resolution lip sync proven reliable across a wide variety of videos. Why choose this: if sync-1.6 produces subtle or feeble-looking lip movements, this version can perform better.


free to use: low-resolution, fast lip sync that will be free forever. Not licensed for commercial use.

sync-1; sync-1.5.1-beta

legacy support: superseded by sync-1.5.0. These endpoints are available only via the API and automatically upgrade to sync-1.5.0 under the hood to serve the best possible results.
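The selection guide above boils down to a simple decision: default to sync-1.6, and fall back to sync-1.5 when 1.6's output looks too subtle or shows scene-boundary artifacts. A minimal sketch of that logic (the helper name and signature are illustrative, not part of any official SDK):

```python
def choose_model(movements_too_subtle=False, scene_boundary_artifacts=False):
    """Pick a lip-sync model per the selection guide.

    sync-1.6 is the default recommendation; sync-1.5 is the stable
    fallback when 1.6's mouth movements are too subtle or artifacts
    appear at scene boundaries. Helper is illustrative only.
    """
    if movements_too_subtle or scene_boundary_artifacts:
        return "sync-1.5"
    return "sync-1.6"

print(choose_model())                               # sync-1.6
print(choose_model(scene_boundary_artifacts=True))  # sync-1.5
```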

understanding model limitations

To get the most out of our models, here’s what you should keep in mind:

  • face visibility: if the face is hidden or turned away, the model won’t sync that section.
  • single face focus: our models currently sync one face at a time. In videos with multiple faces, they may not select the correct one.
  • forward-facing works best: steep angles or side profiles can be tricky for our models.
  • resolution limit: best results are seen with faces up to 512 pixels in size.

Knowing these points helps you align your video inputs with our models' strengths for superior lip-sync results.
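If you pre-screen footage programmatically, the limitations above translate into a straightforward validation pass. A sketch, assuming you have already extracted face metadata (count, bounding-box size, orientation) with your own detector; the function name and inputs are hypothetical:

```python
MAX_FACE_PX = 512  # documented sweet spot for face size


def check_inputs(face_count, face_size_px, is_frontal):
    """Return warnings for conditions the limitations list flags as risky."""
    warnings = []
    if face_count == 0:
        warnings.append("no visible face: that section will not be synced")
    elif face_count > 1:
        warnings.append("multiple faces: the model may not pick the right one")
    if not is_frontal:
        warnings.append("steep angle or side profile: results may degrade")
    if face_size_px > MAX_FACE_PX:
        warnings.append(f"face larger than {MAX_FACE_PX}px: quality may drop")
    return warnings


print(check_inputs(face_count=1, face_size_px=400, is_frontal=True))   # []
print(check_inputs(face_count=2, face_size_px=600, is_frontal=False))  # 3 warnings
```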