AI-focused video synthesis, transformed on AWS


Improving Infrastructure Agility and Cost Efficiency on AWS

Humen.Ai is an AI-focused video synthesis and content creation company that uses deep learning to create personalized, interactive experiences. Its popular iOS app, Sway: Magic Dance, uses state-of-the-art AI modeling to generate video.

The fun-to-use app generates videos from a source clip that users upload of themselves doing basic motions, like moving around, kicking their legs or waving their arms. Its proprietary, PyTorch-backed GAN neural network model creates a digital skeleton of the user. Using that skeleton, users can generate, in 30 seconds, a new, photo-realistic stunt double of themselves dancing like Michael Jackson, twirling like a ballet dancer, or making karate moves like a black belt.


SaaS, Media & Entertainment


Develop a foundation for rapid application releases that will increase infrastructure cost-effectiveness and improve application efficiency and performance on AWS.

Services & Tech

Amazon ECS, Amazon ECR, Spot ASG (containers on ECS), AWS Step Functions, AWS Lambda (serverless)

The Problem

Navigating Infrastructure Scalability and Efficiency Constraints

The app’s operation relied on more than 400 AWS G4 On-Demand Instances controlled by an internally built system that was complex and expensive to maintain. The app launched during the 2020 Super Bowl in partnership with Doritos, after which it reached the number two position in the Apple App Store. With help from AWS, Humen.Ai was able to scale up to handle the traffic spike.

The successful launch uncovered scalability and efficiency issues in its backend. The AI app’s concept was built on its ability to quickly train models for each customer, which required an immense amount of on-demand compute power, as is common in AI-driven applications.

“One thing that we were pretty concerned about was the cost of compute in the backend,” recalled Tinghui Zhou, Co-founder and CEO of Humen.Ai. “Our infrastructure is driven by ML, which requires the usage of graphics processing units (GPUs) for model training and inference on the cloud.”

In addition to the infrastructure issues, the company faced another common challenge in AI operations. Much like the friction between software developers and engineers, data scientists often face friction in moving AI projects from concept to production. AI teams operate much like an R&D team: they are abstracted from engineering once the model is ready to be operationalized.

This puts the burden on engineering to figure out architecture design, resource management and model monitoring so it runs efficiently, securely and reliably. This friction can slow down releases and hinder the pace of innovation. Humen.Ai wanted to remove engineering barriers to shorten the time between building products and releasing them to users, while keeping their small startup team agile.

The Solution

Addressing Containerization Obstacles Through a Custom AWS Solution

The Humen.Ai team was extremely proficient in both AI/ML operations and AWS. They had tried a few pathways to optimize infrastructure, like containerization, but those didn’t work out. Amazon SageMaker would have been a great option for model training and inference, as it lets you run on Spot Instances, a less expensive route than On-Demand Instances. The trade-off is the additional wait time while AWS allocates spare capacity to run training, which would hinder the end user experience. Users needed to quickly upload a video, have it immediately processed and be able to generate Instagram filters in minutes.

As part of the Jumpstart program, AWS referred Humen.Ai to Onica, a Rackspace Technology company, for containerization support. The Jumpstart program provides organizations with low-cost infrastructure, credits and training to support growth. Working with Onica, the Humen.Ai team was able to get to the bottom of the containerization issues. The Onica team helped Humen.Ai containerize its entire AI app so it could deploy effortlessly on Amazon ECS with Spot Instances.


The Onica team then took it a step further by delivering a custom solution built on AWS. Devising a unique method to schedule Amazon ECS containers, Onica empowered Humen.Ai’s machines to do more AI tasks at the same time. This method, combined with managed services for Amazon ECS, enabled it to complete near real-time training, inference, pre-processing and post-processing using higher density machines. “We were definitely on the path of switching to a containerized pipeline. Onica put in substantial effort to make that happen for us quickly,” remarked Zhou.
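The case study does not describe how Onica's scheduling method works, so the following is only a minimal sketch of the general idea of packing more AI tasks onto each machine. It uses first-fit decreasing bin packing, a standard technique swapped in here for illustration. The task names, the memory figures and the 16 GiB capacity are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One GPU machine with a fixed memory budget (hypothetical unit: GiB)."""
    capacity_gib: float
    tasks: list = field(default_factory=list)

    @property
    def used_gib(self) -> float:
        # Total memory claimed by the tasks already placed on this machine.
        return sum(mem for _, mem in self.tasks)

    def fits(self, mem_gib: float) -> bool:
        return self.used_gib + mem_gib <= self.capacity_gib


def pack_tasks(tasks, capacity_gib):
    """First-fit decreasing: largest tasks first, each on the first machine with room."""
    instances = []
    for name, mem in sorted(tasks, key=lambda t: t[1], reverse=True):
        if mem > capacity_gib:
            raise ValueError(f"task {name!r} needs more than one machine can offer")
        target = next((i for i in instances if i.fits(mem)), None)
        if target is None:
            # No existing machine has room, so start a new one.
            target = Instance(capacity_gib)
            instances.append(target)
        target.tasks.append((name, mem))
    return instances


# Hypothetical workload: (task name, GPU memory in GiB).
tasks = [
    ("train", 10.0), ("inference", 4.0), ("preprocess", 2.0),
    ("postprocess", 2.0), ("train", 10.0), ("inference", 4.0),
]
packed = pack_tasks(tasks, capacity_gib=16.0)
print(f"{len(tasks)} tasks packed onto {len(packed)} machines")
```

With these made-up numbers, six tasks that would occupy six machines at one task per machine fit onto two. A real scheduler would also have to account for GPU compute, task duration and Spot interruptions, none of which this sketch models.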

"Working with a professional team of engineers from Onica on optimizing the backend was overall a very positive experience and really helped us scale our infrastructure during the critical, early stages when we don't have a big, backend engineering team."
Tinghui Zhou
Co-founder and CEO

The Outcome

Building a Foundation for Long Term Growth

Combining this extreme software and hardware integration with the use of Spot Instances has resulted in a potential 70% cost reduction – well beyond the expected goal of a 30% decrease. The Humen.Ai team was so enthusiastic about the new infrastructure design that they began building around the project before it was complete. The Onica team was able to keep up with the speed of the small, agile team, moving from proof-of-concept to production in just six weeks. Humen.Ai is now leveraging the new lightweight, efficient infrastructure and newfound agility to create more products and to reinvest in technology.
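To put the 30% goal and the potential 70% reduction side by side, here is a rough worked example. The instance count of 400 comes from the case study. The hourly rate and the 730 hours per month are placeholder assumptions, not actual AWS pricing or Humen.Ai's real bill.

```python
def monthly_cost(instances: int, hourly_rate: float, hours: int = 730) -> float:
    """Cost of running a fleet around the clock for one month."""
    return instances * hourly_rate * hours


def after_reduction(cost: float, reduction_pct: int) -> float:
    """Cost remaining after a percentage reduction."""
    return cost * (100 - reduction_pct) / 100


# 400 instances is from the case study; 1.00 per hour is a hypothetical rate.
baseline = monthly_cost(instances=400, hourly_rate=1.00)
goal = after_reduction(baseline, 30)       # the expected 30% decrease
potential = after_reduction(baseline, 70)  # the potential 70% reduction

print(f"baseline:  {baseline:>10,.0f}")
print(f"30% goal:  {goal:>10,.0f}")
print(f"70% case:  {potential:>10,.0f}")
```

Whatever the real rate, the proportions hold: a 70% reduction leaves 30% of the original bill, while the 30% goal would have left 70% of it.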

Previously, Humen.Ai managed hundreds of instances. By moving to Amazon ECS with Spot Instances, the infrastructure is now easier to manage and operates at lower costs and higher density. The AI startup has been able to replace a whole host of infrastructure with managed services. Onica created a lightweight AI infrastructure that allowed Humen.Ai to manage everything using only serverless technologies. This takes a huge burden off of the engineering team and allows the AI team to quickly innovate and efficiently deploy to production.

According to Zhou, “We’ve doubled, maybe even tripled, our throughput of processing demands from our users.” Instances are better utilized, allowing the machines to run more efficiently. This allows the app to handle more concurrent user requests. “I think that definitely had an impact on the user experience,” Zhou continued. 

With a very small team, Humen.Ai is now able to build instead of just managing what it has already built. The team has reduced its technical debt to a level where it can move forward. “We were able to bring down costs significantly and it really improved our backend efficiency,” Zhou notes as the biggest outcome of his engagement with Onica. The cost reductions were critical in ensuring the early-stage startup’s long-term viability by conserving cash flow.

With the new infrastructure design in place, Zhou plans to expand Humen.Ai’s dance videos to sports, TV, gaming and movies. Additionally, the team wants to provide a way for users to share their created content within the app for collaboration. Humen.Ai is also trying to improve the photorealism of its output with 3D-based perception for its skeleton and scene-analysis ML models.


Why Onica

Onica is one of the largest and fastest-growing Amazon Web Services (AWS) Premier Consulting Partners in the world, helping companies enable, operate, and innovate in the cloud. From migration strategy to operational excellence and immersive transformation, Onica is a full spectrum AWS integrator. Learn more at