"A playful AR app that brings life into a lifeless 'face' by placing AR speech balloons into the virtual world based on GPS."


“Bringing life into a lifeless face!”

BOOM! is a playful AR product that captures users’ thoughts and feelings, attaches them to the lifeless faces found around the city, such as posters and statues, and shares them with other people. It detects human faces of any kind (e.g., in posters, pictures, drawings, sculptures, or statues) and places the user’s thought in a speech balloon next to the detected face. It uses face detection technology to identify a face, and speech-to-text technology to recognize the user’s speech and convert it into text.



The concept arose from a simple observation of daily scenes in New York City, a crossroads where arts and cultures meet. Every day we encounter hundreds of faces in posters, pictures, statues, and paintings, even in logos, not to mention real human faces. Those faces, especially the inanimate ones, inspired us to imagine the possibilities of communicating with other people in the city by bringing these lifeless faces to life.



There are faces in the digital world as well, seen through the small but smart device in everyone’s hand. Facebook or Instagram can introduce us to faces we know, as well as faces we have never seen in our lives.

Lifeless Faces in the Real World

Living Faces in the Digital World

These faces in the digital world are flattened 2D images of living people in the real world. They are framed within a rectangular screen, and as we scroll down the feed we register them half-consciously, assuming that the people behind them are breathing somewhere in the real world. In this way, we confront and consume countless faces of living people, along with their small notes, every day.

What would it look like if there were a virtual world where lifeless faces could communicate with us? What if posters, statues, sculptures, and mannequins could talk? What if there were a spacetime version of a “facebook”?

[Image: User scenario]


To have fun with the lifeless faces around us, people need to find them in objects like posters or statues and bring them to life by giving them the ability to speak. We held a brainstorming session to examine what motivates people to interact with these lifeless faces (why/motivation), the different ways they could interact with them (how/technology), and what messages people would want to communicate through them (what/current platforms, features, usage).


Finally, we decided to create a mobile app that uses AR technology to capture moments, or to jot down random things, in the virtual world next to lifeless-faced objects in the real world. We chose AR because it gives users far more freedom in the design, content, and composition of their speech balloons than traditional methods like physical stickers.



The AR speech balloons live in a virtual space that users can see only through their mobile devices, which serve as magic windows. We also wanted the interaction to be as seamless and fluid as possible, so that users would find the app simple and handy to use without much effort or time. Thus, we decided to implement voice recognition technology to convert speech into text automatically.

[Image: Prototype]

In short, users can augment any object with a human face shape and bring it to life by capturing their thoughts and leaving them in an AR speech bubble, simply by talking to their mobile device.




[Image: Prototypes & usability test]


Speech Recognition (Swift)

import UIKit
import ReplayKit
import AVFoundation
import Vision
import Speech
import Photos
import SwiftGifOrigin

    //speech recognizer setup
    let speechRecognizer = SFSpeechRecognizer(locale: Locale.init(identifier: "en-US"))
    var regRequest: SFSpeechAudioBufferRecognitionRequest?
    var regTask: SFSpeechRecognitionTask?
    let avEngine = AVAudioEngine()

    func getPermissionSpeechRecognizer() {
        SFSpeechRecognizer.requestAuthorization { (status) in
            // The callback can arrive on a background queue,
            // so update UI state on the main thread
            DispatchQueue.main.async {
                switch status {
                case .authorized:
                    self.boomButton.isEnabled = true
                case .denied, .notDetermined, .restricted:
                    self.boomButton.isEnabled = false
                @unknown default:
                    self.boomButton.isEnabled = false
                }
            }
        }
    }
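
The setup above only creates the recognizer and asks for permission; the transcription itself is driven by a recognition task fed with microphone audio. A minimal sketch of how that could look using the same `speechRecognizer`, `regRequest`, `regTask`, and `avEngine` properties (the `startSpeechToText` name and the exact wiring are illustrative, not verbatim app code):

```swift
    func startSpeechToText() throws {
        // Buffer-based request so we can stream live microphone audio
        regRequest = SFSpeechAudioBufferRecognitionRequest()
        guard let request = regRequest else { return }
        request.shouldReportPartialResults = true

        // Tap the audio engine's input node and feed buffers to the request
        let inputNode = avEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        avEngine.prepare()
        try avEngine.start()

        // Stream partial transcriptions into the speech-balloon label
        regTask = speechRecognizer?.recognitionTask(with: request) { result, error in
            if let result = result {
                DispatchQueue.main.async {
                    self.txtSpeech.text = result.bestTranscription.formattedString
                }
            }
            // Stop the engine once the utterance is final or fails
            if error != nil || (result?.isFinal ?? false) {
                self.avEngine.stop()
                inputNode.removeTap(onBus: 0)
                self.regRequest = nil
                self.regTask = nil
            }
        }
    }
```

With partial results enabled, the text in the balloon updates word by word while the user speaks, which keeps the interaction feeling live rather than batch-transcribed.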

Face Detection (Swift)

    //Vision framework objects
    let faceDetection = VNDetectFaceRectanglesRequest()
    let faceLandmarks = VNDetectFaceLandmarksRequest()
    let faceLandmarksDetectionRequest = VNSequenceRequestHandler()
    let faceDetectionRequest = VNSequenceRequestHandler()
    override func viewDidLoad() {
        super.viewDidLoad()

        bubbleMarker.frame = CGRect(x: 0, y: 0, width: 340, height: 320)
        txtSpeech.frame = CGRect(x: 0, y: 0, width: 250, height: 120)
        lipMarker.frame = CGRect(x: 0, y: 0, width: 200, height: 1200)
        lipMarker.image = UIImage.gif(name: "particle5")
        bubbleMarker.isHidden = true
        txtSpeech.isHidden = true
        lipMarker.isHidden = true

        boomButton.isEnabled = false
        cameraButton.layer.cornerRadius = cameraButton.frame.size.height / 3
        cameraButton.layer.masksToBounds = true
        txtSpeech.text = ""
        speechRecognizer?.delegate = self
    }
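
viewDidLoad only creates the Vision request objects; they still have to be performed on each camera frame. A sketch of how the face-detection request could be run on a CVPixelBuffer from the capture session, using the properties declared above (the `detectFace(in:)` helper name and the coordinate conversion are our illustration, not verbatim app code):

```swift
    func detectFace(in pixelBuffer: CVPixelBuffer) {
        // Run the face-rectangles request through the sequence handler
        try? faceDetectionRequest.perform([faceDetection], on: pixelBuffer)

        guard let faces = faceDetection.results as? [VNFaceObservation],
              let face = faces.first else {
            // No face in the frame: hide the balloon
            bubbleMarker.isHidden = true
            txtSpeech.isHidden = true
            return
        }

        // Vision returns a normalized bounding box with a bottom-left
        // origin; convert it to UIKit view coordinates (top-left origin)
        let box = face.boundingBox
        let rect = CGRect(x: box.minX * view.bounds.width,
                          y: (1 - box.maxY) * view.bounds.height,
                          width: box.width * view.bounds.width,
                          height: box.height * view.bounds.height)

        // Anchor the speech balloon next to the detected face
        bubbleMarker.frame.origin = CGPoint(x: rect.maxX, y: rect.minY)
        bubbleMarker.isHidden = false
        txtSpeech.isHidden = false
    }
```

Because the balloon is repositioned every frame, it appears pinned beside the face even as the user moves the phone around the poster or statue.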


[Image: BOOM! UX flow]




Case 1. Single User

Case 2. Multi-User


Multi-User with Screen View