Blinking Detection and Counter Mediapipe Eye Tracking Part 1 and 2

Eyes Tracking with MediaPipe

How do we detect eye blinks with a computer vision algorithm? I will go through that briefly in this blog post. In short, we have to identify the difference between closed eyes and open eyes in each image (frame).

CodeBase

You can find the source code in a GitHub repository, but here I am going through simple code snippets.

What we need to get started

  • Python
  • Mediapipe

When you install Mediapipe, pip also installs its requirements, such as OpenCV and NumPy. We need OpenCV for image processing, but there is no need to install it separately, since it is pulled in together with Mediapipe.

If you want to know more about Mediapipe

Models in Mediapipe

  • Face Mesh (the one we are going to use here)
  • Face Detection
  • Multi-hand tracking
  • Selfie Segmentation
  • Full Body Pose Estimation
  • many more

Installation of Mediapipe

# Windows machine

pip install mediapipe 

# Linux or mac

pip3 install mediapipe

Landmarks Detection (Face Mesh)

Our main focus will be the eyes. In order to extract the eyes we need their landmarks. Since Mediapipe provides the landmarks as normalized values, we need to convert them into pixels, that is, coordinates relative to the image plane. This is a simple function that does nothing but turn normalized values into pixel coordinates.

Normalized to coordinates

From Face Mesh we get 468 landmarks, so we have to loop through each landmark, which gives us its x and y values. For the conversion we multiply x by the image width and y by the image height. The results are pixel coordinates, which we store as a list of (x, y) tuples, one per landmark.

import cv2 as cv

def landmarksDetection(img, results, draw=False):
    img_height, img_width = img.shape[:2]
    # Convert each normalized landmark of the first detected face to pixels: [(x, y), (x, y), ...]
    mesh_coord = [(int(point.x * img_width), int(point.y * img_height)) for point in results.multi_face_landmarks[0].landmark]

    if draw:
        # Draw a small filled green circle on every landmark
        for p in mesh_coord:
            cv.circle(img, p, 2, (0, 255, 0), -1)

    # Return the list of (x, y) tuples, one per landmark
    return mesh_coord

The function accepts three Arguments

  • img: the image (frame), mat (NumPy)
  • results: the Face Mesh output from Mediapipe, holding the 468 normalized landmarks
  • draw: decides whether to draw a circle on each landmark or not

Returns a list of tuples containing the image coordinates of each landmark.
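To see the conversion on its own, here is a minimal sketch that does not need Mediapipe or a camera. The helper name to_pixel_coords and the stand-in landmarks built with SimpleNamespace are assumptions for illustration only; real landmarks come from Face Mesh.

```python
from types import SimpleNamespace

def to_pixel_coords(landmarks, img_width, img_height):
    """Scale normalized (0..1) landmarks to integer pixel coordinates."""
    return [(int(p.x * img_width), int(p.y * img_height)) for p in landmarks]

# Hypothetical stand-ins for two normalized landmarks
fake_landmarks = [SimpleNamespace(x=0.5, y=0.5), SimpleNamespace(x=0.25, y=0.75)]

pixels = to_pixel_coords(fake_landmarks, img_width=640, img_height=480)
print(pixels)  # [(320, 240), (160, 360)]
```

A landmark at the centre of the normalized space (0.5, 0.5) lands at the centre of a 640x480 frame, which is a quick sanity check for the width and height order.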

In order to detect a blink we have to consider a few things. First of all, here is an image that shows the difference between closed and open eyes, with the landmarks drawn.

Slide1.PNG

Focusing on the eyes, it clearly shows that when the eyes are open, the width and height of the eye landmarks are at their maximum. When the eyes are closed there is no effect on the width, but the height changes. That is our catch: we are going to target it to detect the blink of an eye. Instead of height and width, we will find the Euclidean distance. One could ask why Euclidean distance. It is because when we tilt our head a bit, the width and height of the eyes change accordingly, so to avoid that we have to find the distance from one point to the other.

We will find the ratio of the horizontal distance to the vertical distance, which allows us to detect blinks.

import math

# Euclidean distance between two (x, y) points
def euclaideanDistance(point, point1):
    x, y = point
    x1, y1 = point1
    distance = math.sqrt((x1 - x)**2 + (y1 - y)**2)
    return distance
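The point about head tilt can be checked with a small sketch. The corner coordinates and the rotate helper below are hypothetical, chosen only to compare the axis-aligned width with the Euclidean distance when the same two points are rotated by 30 degrees.

```python
import math

def rotate(point, degrees):
    """Rotate a point around the origin, imitating a tilted head."""
    angle = math.radians(degrees)
    x, y = point
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))

# Hypothetical eye corners, 30 pixels apart on a level head
left_corner, right_corner = (0.0, 0.0), (30.0, 0.0)
tilted_left = rotate(left_corner, 30)
tilted_right = rotate(right_corner, 30)

# Axis-aligned width shrinks when the head is tilted
width_level = abs(right_corner[0] - left_corner[0])
width_tilted = abs(tilted_right[0] - tilted_left[0])

# Euclidean distance stays the same
dist_level = math.hypot(right_corner[0] - left_corner[0], right_corner[1] - left_corner[1])
dist_tilted = math.hypot(tilted_right[0] - tilted_left[0], tilted_right[1] - tilted_left[1])

print(width_level, width_tilted)  # the tilted width is roughly 26 instead of 30
print(dist_level, dist_tilted)    # both distances are roughly 30
```

The width measured along the x axis drops as soon as the points are rotated, while the point-to-point distance does not, which is why the distance is the safer measurement here.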

First of all, we will select the landmarks we need from the eye landmarks. Let me show you in the image first.

Presentation1.jpg

The image above shows two lines drawn on the eyes using landmarks, indicating the horizontal and vertical distance between two landmarks. When the eyes are open, the vertical distance reaches its maximum value while the horizontal distance remains constant. When the eyes are closed, on the other hand, the vertical distance approaches its minimum value.
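A small worked example may help here. The coordinates below are made up for illustration, and eye_ratio is a hypothetical helper, not part of the original code. It divides the horizontal distance by the vertical one, and clamps the vertical distance because integer pixel coordinates can coincide on a fully closed eye.

```python
import math

def eye_ratio(h_start, h_end, v_top, v_bottom):
    """Horizontal distance divided by vertical distance for one eye."""
    horizontal = math.dist(h_start, h_end)
    vertical = math.dist(v_top, v_bottom)
    # Clamp to a tiny value so a zero vertical distance cannot divide by zero
    return horizontal / max(vertical, 1e-6)

# Hypothetical pixel coordinates: same eye corners, different eyelid gap
open_eye = eye_ratio((100, 200), (130, 200), (115, 195), (115, 205))
closed_eye = eye_ratio((100, 200), (130, 200), (115, 199), (115, 202))

print(open_eye)    # 3.0
print(closed_eye)  # 10.0
```

The horizontal distance is 30 pixels in both cases, so the ratio rises from 3 to 10 only because the vertical distance shrinks from 10 pixels to 3.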

    # Right eyes
    # horizontal line 
    rh_right = landmarks[right_indices[0]]
    rh_left = landmarks[right_indices[8]]
    # vertical line 
    rv_top = landmarks[right_indices[12]]
    rv_bottom = landmarks[right_indices[4]]

In this code snippet we are selecting the landmarks for the right eye in order to find the distance between the points, as shown visually in the image above.

This function selects the landmarks for the horizontal and vertical points of the eyes, finds the distance between the points with the help of the Euclidean distance, and calculates the blink ratio for each eye by dividing the horizontal distance by the vertical distance, which allows us to detect the blink of an eye.


# Blinking Ratio
def blinkRatio(img, landmarks, right_indices, left_indices):
    # Right eyes 
    # horizontal line 
    rh_right = landmarks[right_indices[0]]
    rh_left = landmarks[right_indices[8]]
    # vertical line 
    rv_top = landmarks[right_indices[12]]
    rv_bottom = landmarks[right_indices[4]]
    # draw lines on right eyes 
    # cv.line(img, rh_right, rh_left, utils.GREEN, 2)
    # cv.line(img, rv_top, rv_bottom, utils.WHITE, 2)

    # LEFT_EYE 
    # horizontal line 
    lh_right = landmarks[left_indices[0]]
    lh_left = landmarks[left_indices[8]]

    # vertical line 
    lv_top = landmarks[left_indices[12]]
    lv_bottom = landmarks[left_indices[4]]
    # Finding Distance Right Eye
    rhDistance = euclaideanDistance(rh_right, rh_left)
    rvDistance = euclaideanDistance(rv_top, rv_bottom)
    # Finding Distance Left Eye
    lvDistance = euclaideanDistance(lv_top, lv_bottom)
    lhDistance = euclaideanDistance(lh_right, lh_left)

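    # Finding ratio of LEFT and Right Eyes
    # (the vertical distance can be 0 on fully closed eyes, so clamp it to avoid division by zero)
    reRatio = rhDistance / max(rvDistance, 1e-6)
    leRatio = lhDistance / max(lvDistance, 1e-6)
    ratio = (reRatio + leRatio) / 2
    return ratio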

The function accepts four Arguments

  • img: the image (frame), mat (NumPy)
  • landmarks: the mesh_coord list returned by the landmarks detection function
  • right_indices: the Face Mesh landmark indices of the right eye
  • left_indices: the Face Mesh landmark indices of the left eye

Returns the combined ratio of both eyes, which allows us to detect blinks.
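The ratio still has to be turned into a blink count. Here is one possible sketch: it treats a frame as "closed" when the ratio goes above a threshold, and counts a blink when the eyes reopen after staying closed for a few frames. The function count_blinks, the threshold of 5.0, and the minimum of 2 closed frames are all assumptions for illustration. They are not values from this post and would need tuning for your camera and face.

```python
def count_blinks(ratios, threshold=5.0, min_closed_frames=2):
    """Count blinks in a sequence of per-frame blink ratios."""
    blinks = 0
    closed_frames = 0
    for ratio in ratios:
        if ratio > threshold:
            # Eyes look closed in this frame
            closed_frames += 1
        else:
            # Eyes reopened: count a blink only if they stayed closed long enough
            if closed_frames >= min_closed_frames:
                blinks += 1
            closed_frames = 0
    return blinks

# Hypothetical ratios for 12 consecutive frames
ratios = [3.1, 3.0, 6.2, 7.0, 6.5, 3.2, 3.1, 5.8, 3.0, 6.9, 6.4, 3.3]
print(count_blinks(ratios))  # 2
```

The single high frame (5.8) is ignored as noise, while the two longer runs of high ratios each count as one blink. Requiring a few consecutive closed frames is a simple way to avoid counting one noisy frame as a blink.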

That’s all. If you have any questions, please comment; I would love to reply.