LED - 4. Let's make a face detecting LED display - Part 2 (NVIDIA Jetson Nano for face detection)

There are a variety of solutions for facial recognition on edge devices: Raspberry Pi + Google Coral, Raspberry Pi + Intel Movidius, or NVIDIA's Jetson series. In this post, I will use the Jetson Nano, the cheapest board in the NVIDIA Jetson series. You can find a comparison of the pros and cons of these combinations on my other blog. If you are not familiar with Edge AI, please refer to this article.

<Coral Dev Board           Raspberry Pi + Movidius Stick            Jetson Nano       >

Prepare Jetson Nano

I'm going to use a Jetson Nano with JetPack 4.3. The OS installation and initial setup are discussed in detail in my other post.

  • Gain basic knowledge about the Jetson Nano. See my blog for help.
  • Prepare the SD card image and proceed with the initial setup. See my blog for help.
  • Install the NVIDIA DNN vision library. See my blog for help.
  • Learn how to use the NVIDIA DNN vision library for facial recognition. See my blog for help.
  • Connect a webcam and test it (a quick test sketch follows below). See my blog for help.
Be careful: If you have no experience with the Jetson Nano or you lack a basic grasp of deep learning concepts, it can be difficult to troubleshoot problems you may encounter during installation.
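
Once the setup above is complete, a few lines of Python are enough to confirm that the webcam works. Below is a minimal sketch using the legacy jetson.utils camera API that ships with the JetPack 4.3-era jetson-inference; adjust the resolution and the /dev/video device to match your camera.

import jetson.utils

# open the USB webcam at 640X480 (change /dev/video0 if your camera differs)
camera = jetson.utils.gstCamera(640, 480, "/dev/video0")
display = jetson.utils.glDisplay()

while display.IsOpen():
    img, width, height = camera.CaptureRGBA()    # grab one RGBA frame from the camera
    display.RenderOnce(img, width, height)       # draw it in an OpenGL window
    display.SetTitle("Webcam test {:d}x{:d}".format(width, height))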

Make face recognition program

If you have successfully installed the NVIDIA DNN vision library, you can easily implement a facial recognition program using a webcam. Refer to the blog's examples to make it easier. Unlike most programs, this one takes a long time to load the deep learning network at startup. So don't panic if your program doesn't respond right away while a lot of log output scrolls by.


#!/usr/bin/python3
#
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
#
#
# You should calculate the width / distance ratio to get the camera's view angle.
# (This video may help you: https://www.youtube.com/watch?v=K9PUCmdXlQc)
# This value will vary depending on your camera type and resolution settings.
# In my case (Logitech HD 720P) 1280X720, 200mm(W) / 220mm distance -> 50 degree view angle (even though the camera specification says 60 degrees)
#                               640X480,  150mm(W) / 230mm distance -> 43.6 degree view angle

import jetson.inference
import jetson.utils
from socket import *
import argparse
import sys, time
import numpy as np
import cv2
# parse the command line
parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.", 
         formatter_class=argparse.RawTextHelpFormatter, epilog=jetson.inference.detectNet.Usage())

parser.add_argument("--network", type=str, default="ssd-mobilenet-v2", help="pre-trained model to load (see below for options)")
parser.add_argument("--overlay", type=str, default="box,labels,conf", help="detection overlay flags (e.g. --overlay=box,labels,conf)\nvalid combinations are:  'box', 'labels', 'conf', 'none'")
parser.add_argument("--threshold", type=float, default=0.3, help="minimum detection threshold to use") 
parser.add_argument("--camera", type=str, default="/dev/video0", help="index of the MIPI CSI camera to use (e.g. CSI camera 0)\nor for VL42 cameras, the /dev/video device to use.\nby default, MIPI CSI camera 0 will be used.")
parser.add_argument("--width", type=int, default=640, help="desired width of camera stream (default is 640 pixels)")
parser.add_argument("--height", type=int, default=480, help="desired height of camera stream (default is 480 pixels)")

try:
    opt = parser.parse_known_args()[0]
except:
    print("")
    parser.print_help()
    sys.exit(0)


def calculate_size(w, h):
    return w * h

def find_big_face(detections):
    '''Return the index of the largest detected face (by bounding-box area).'''
    big_index = 0
    big_size = 0
    for index, detection in enumerate(detections):
        size = calculate_size(detection.Right - detection.Left, detection.Bottom - detection.Top)
        if size > big_size:
            big_size = size
            big_index = index
    return big_index

'''
Find the center of the face, then calculate the angular offset
of that center point from the image center.
(Uses the module-level 'center' and 'pixel_degree' defined below;
they are bound when the function is called, so this is safe.)
'''
def detect_face_angle(right, left, bottom, top):
    w_center = left + (right - left) / 2.0
    h_center = top + (bottom - top) / 2.0
    w_pixel = center[0] - w_center
    h_pixel = center[1] - h_center
    return w_pixel * pixel_degree, h_pixel * pixel_degree

# center of the image
center = (int(opt.width / 2), int(opt.height / 2))
# degrees per pixel of my webcam (Logitech HD 720P at 640X480, 43.6 degree horizontal view angle)
pixel_degree = 43.6 / opt.width

# load the face detection network ('facenet' detects faces in the frame)
net = jetson.inference.detectNet('facenet', threshold=opt.threshold)
# create the camera and display
camera = jetson.utils.gstCamera(opt.width, opt.height, opt.camera)
display = jetson.utils.glDisplay()

count = 0
# capture once to verify the actual stream dimensions
img, width, height = camera.CaptureRGBA()
print("========== Capture Width:%d Height:%d ==========="%(width, height))
# IP address and UDP port of the Raspberry Pi that drives the LED display
RPi_IP = '192.168.11.54'
RPi_PORT = 9090

RPi_sock = socket(AF_INET, SOCK_DGRAM)
# process frames until the user exits
# only send an update when the face has moved more than these thresholds (degrees, [horizontal, vertical])
angle_threshold = [2.5, 1.5]
current_angle = [0.0, 0.0]
while True:
    try:
        s = time.time()
        move = False
        # capture the image
        img, width, height = camera.CaptureRGBA()

        # detect objects in the image (with overlay)
        detections = net.Detect(img, width, height, opt.overlay)

        # print the detections
        print("detected {:d} objects in image".format(len(detections)))
        if len(detections) == 0:
            continue

        fps = 1.0 / (time.time() - s)
        big_index = find_big_face(detections)
        w_angle, h_angle = detect_face_angle(detections[big_index].Right, detections[big_index].Left, detections[big_index].Bottom, detections[big_index].Top)
        if abs(w_angle - current_angle[0]) > angle_threshold[0]:
            move = True
            current_angle[0] = w_angle
        if abs(h_angle - current_angle[1]) > angle_threshold[1]:
            move = True
            current_angle[1] = h_angle

        display.RenderOnce(img, width, height)
        if move:
            print("FPS:%f , Big Face Angle H:%f degree V:%f degree"%(fps,w_angle,h_angle))
            packet = "%d,%d,%d,%d,%f,%f"%(detections[big_index].Right, detections[big_index].Left, detections[big_index].Bottom, detections[big_index].Top,current_angle[0], current_angle[1])
            RPi_sock.sendto(packet.encode(), (RPi_IP, RPi_PORT))

        # print out performance info
        # net.PrintProfilerTimes()
        count += 1
    except KeyboardInterrupt:
        break
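
The pixel_degree value in the code comes from the horizontal view angle I measured for my webcam (see the header comment of the script). If you use a different camera or resolution, one common way to estimate the view angle is the pinhole-camera approximation below. This is a sketch under that assumption; hand-measured values, like mine above, may differ slightly from it.

import math

def horizontal_view_angle(visible_width_mm, distance_mm):
    # pinhole model: FOV = 2 * atan((W / 2) / D)
    return math.degrees(2.0 * math.atan((visible_width_mm / 2.0) / distance_mm))

# e.g. 200mm visible at 220mm distance -> about 49 degrees (close to the 50 I measured)
fov = horizontal_view_angle(200.0, 220.0)
pixel_degree = fov / 640    # degrees per pixel at a 640-pixel frame width
print("FOV: %.1f deg, degrees per pixel: %.4f" % (fov, pixel_degree))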

Run the above Python code on a Jetson Nano with JetPack 4.3. You need to watch the webcam window yourself, so if possible, run it directly on the Jetson Nano's Ubuntu desktop.
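
For example, assuming you saved the script as face_detect_udp.py (the file name is up to you):

python3 face_detect_udp.py --camera=/dev/video0 --width=640 --height=480 --threshold=0.3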


<Jetson Nano output screen>

UDP is used to send the recognized face information to the Raspberry Pi. In general, UDP has lower transmission reliability than TCP. However, if losing a few packets does not break your service, UDP is recommended for light and fast transmission. For this reason, UDP is widely used for multimedia streaming and IoT data transmission.


Implementing UDP communication in Python is very simple. It takes just a few lines of code.


from socket import *

RPi_IP = '192.168.11.54'    # IP address of the receiving Raspberry Pi
RPi_PORT = 9090             # UDP port the Raspberry Pi listens on
RPi_sock = socket(AF_INET, SOCK_DGRAM)
packet = "320,160,240,80,1.5,-0.8"    # example payload: right,left,bottom,top,w_angle,h_angle
RPi_sock.sendto(packet.encode(), (RPi_IP, RPi_PORT))
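
The receiving side on the Raspberry Pi can be just as short. Here is a minimal sketch of a matching receiver (the actual Raspberry Pi code is the topic of the next post):

from socket import *

sock = socket(AF_INET, SOCK_DGRAM)
sock.bind(('', 9090))    # listen on the same UDP port
while True:
    data, addr = sock.recvfrom(1024)    # one face-info packet per datagram
    right, left, bottom, top, w_angle, h_angle = (float(v) for v in data.decode().split(','))
    print("face from %s: H %.1f deg, V %.1f deg" % (addr[0], w_angle, h_angle))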


Wrapping up

If you're not familiar with the Jetson Nano and Edge AI, the site of Dustin Franklin (Jetson Developer Evangelist, NVIDIA) will help you: https://github.com/dusty-nv/jetson-inference
Dustin Franklin's YouTube videos will also help.



And I hope my blog posts about the Jetson series are also helpful.
The next post will focus on the Raspberry Pi. The face information sent from the Jetson Nano will be used to adjust the animated eye position on the RGB LED display so that the eyes and the detected face can look at each other.

You can download the source code here: https://github.com/raspberry-pi-maker/IoT



