Marking another milestone in Kakao Brain’s face-swapping research, the company’s paper ‘Smooth-Swap: A Simple Enhancement for Face-Swapping with Smoothness’ will be presented at CVPR 2022, the upcoming global computer vision conference. The paper has been selected for an oral presentation, a session reserved for the most outstanding of the accepted papers (25.33% of 8,161 submissions were accepted this year). This is the second consecutive year that Kakao Brain has earned an oral presentation: at last year’s event, where only 4% of accepted papers were given an oral slot, the company was selected for its paper ‘HOTR: End-to-End Human-Object Interaction Detection with Transformers.’ This year, ‘Smooth-Swap’ has been recognized by the premier computer vision conference both for significantly reducing architectural complexity and for its strong commercialization potential.
An accurate and consistent identity gradient is essential for seamlessly changing a person’s identity without sacrificing image quality. Trained with a supervised contrastive loss, the ‘Smooth-Swap’ identity embedder learns a smoother embedding, which in turn provides a stable identity gradient. This addresses a weakness of earlier models, which relied on handcrafted components and 3D face modeling that complicated their design and required sophisticated hyperparameter tuning. Instead, ‘Smooth-Swap’ relies on a simple U-Net-based architecture with an integrated smooth identity embedder to deliver cutting-edge performance.
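For readers unfamiliar with the term, the following is a minimal sketch of a generic supervised contrastive loss, in which embeddings of faces that share an identity label are pulled together and all others are pushed apart. It is an illustration only and not Kakao Brain’s implementation: the function names, the temperature value, and the toy 2-D embeddings are assumptions made for this example.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, hypothetical helper).

    For each anchor, every other sample with the same identity label is a
    positive. The loss is low when positives are closer to the anchor than
    the remaining samples. The temperature of 0.1 is an assumed value.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled similarity between the anchor and every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        log_denominator = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy 2-D "face embeddings": the first two point one way, the last two another.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setup the loss is smaller when the identity labels agree with how the embeddings cluster, which is the behavior such a loss rewards during training.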
The simple architecture and enhanced performance of ‘Smooth-Swap’ not only make the technology competitive in terms of commercialization potential and wider application, they also allow it to handle more challenging scenarios such as face swapping during video playback. ‘Smooth-Swap’ proposes a distinct identity embedding approach that enables the generator to create higher-quality images, especially when changing a subject’s face shape. Because ‘Smooth-Swap’ enables fast and stable face swapping, it is expected to support the development of various kinds of digital humans such as virtual influencers, show hosts, and announcers.
“We are proud and excited to unveil the groundbreaking face-swapping technology, ‘Smooth-Swap,’ to the world,” said Kim Il-doo, CEO of Kakao Brain. “I strongly believe this technology will accelerate innovation in the face-swapping sphere, bringing us another step closer to the incredibly immersive metaverse we always dreamed of as well as the digital human services of the future.”
About Kakao Brain
Kakao Brain is a world-leading AI company boasting unparalleled AI technologies and research & development networks. The company was established by Kakao in 2017 to solve some of the globe’s biggest ‘unthinkable questions’ with solutions enabled by its lifestyle-transforming AI technologies. Constantly driving innovation in the world of technology, Kakao Brain has developed numerous groundbreaking AI services and models designed to enhance quality of life for thousands of people, including minDALL-E, KoGPT, CLIP / ALIGN, and RQ-Transformer. As a global pioneer of AI, Kakao Brain has the responsibility of fostering a vibrant tech community and robust R&D ecosystem as it carries out its mission to form new tech markets with endless potential. For more information, visit https://KakaoBrain.com/.
 Identity embedding is a vector representation of a face image used to compare identities. If the representation vectors (or embedding vectors) of two faces are close enough, their identities are regarded as the same.
 CVPR (Conference on Computer Vision and Pattern Recognition), co-sponsored by the Institute of Electrical and Electronics Engineers (IEEE) and the Computer Vision Foundation (CVF) since 1983, is regarded as one of the most prestigious annual conferences in the computer vision sector, along with the European Conference on Computer Vision (ECCV) and the International Conference on Computer Vision (ICCV).
 Identity gradient is a learning signal telling the face-swapping model which part must be tuned to change the person’s identity accurately.
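The identity embedding note above can be illustrated with a short sketch that compares two embedding vectors and treats them as the same identity when they are close enough. Cosine similarity, the 0.8 threshold, and the 2-D vectors are assumptions chosen for illustration; the release does not state which distance measure or threshold is used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_identity(a, b, threshold=0.8):
    """Hypothetical decision rule: embeddings at or above the threshold match."""
    return cosine_similarity(a, b) >= threshold


# Toy embeddings; real face embeddings typically have hundreds of dimensions.
face_a = [1.0, 0.0]
face_b = [0.9, 0.1]   # nearly the same direction as face_a
face_c = [0.0, 1.0]   # orthogonal to face_a

match_ab = same_identity(face_a, face_b)
match_ac = same_identity(face_a, face_c)
```

Here `match_ab` is true and `match_ac` is false, since only the first pair of vectors points in nearly the same direction.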
SOURCE Kakao Brain