Eyes Tracking with mediapipe

Eyes tracking part3


How to estimate where the eyes are looking, using mediapipe and OpenCV in Python.

This post continues the previous one on the eye blink detector and counter.

The Code Base

The entire source code is available in the GitHub repository Eyes Position Estimator mediapipe. There you will find the source code for each part of the series, because I have created a video tutorial on each topic, from basic landmark detection to the Eyes Position Estimator.

Mentioned Topics are:

  1. Masking
  2. Cropping the Shape
  3. Cropping Custom Shape
  4. Denoise
  5. Thresholding (converting an image to a binary image)
  6. Counting Pixels using NumPy
  7. Draw Custom Shape

How to Extract Eyes from frame/Image

Here we are going to crop only the eyes out of the frame; the same approach works for any custom shape. First of all, we need to create an empty image (a mask). This can be done with NumPy's zeros function, which takes the image size and the data type of each pixel, which here is an unsigned 8-bit integer (uint8).

A simple way of creating a mask.

Basically, you can create a colour or a grayscale mask; you only need the dimensions (the size of the mask). In the end, the mask is nothing but an image made of black pixels.

Here we convert the image to grayscale using the OpenCV function cvtColor. It can also convert a BGR image to other colour formats such as RGB, BGRA, RGBA and many more.

gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

Grayscale image as input: gray_scale.png

We need the size of the image to create the mask. Since an image in OpenCV is nothing but a NumPy array, we can use its shape attribute to find the dimensions. For a colour image the shape also includes the number of colour channels, but a grayscale image has a single channel, so its shape is just (height, width).

dim = gray.shape

Creating the mask is simple: you just call NumPy's zeros function.

mask = np.zeros(dim, dtype=np.uint8)
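As a minimal sketch of the colour versus grayscale point, the only difference between the two masks is the shape passed to np.zeros. The 480 by 640 frame size is an assumption picked for illustration, not a value from the project.

```python
import numpy as np

# Assumed frame size for illustration: 480 rows (height) by 640 columns (width).
height, width = 480, 640

# Grayscale mask: one value per pixel, all black (0).
gray_mask = np.zeros((height, width), dtype=np.uint8)

# Colour mask: three channels (B, G, R) per pixel, all black.
color_mask = np.zeros((height, width, 3), dtype=np.uint8)

print(gray_mask.shape, color_mask.shape)
```

In this post only the grayscale form is needed, because the mask is built from gray.shape.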

To extract any shape, you first need to draw that shape on the mask in white. Here we use OpenCV's fillPoly function, which draws a filled custom shape; it needs the points as a NumPy array of type int32. right_eye_coords is a list of tuples holding the coordinates of the eye landmarks. Since our goal is to extract both eyes, we convert each list into a NumPy array and draw both eye shapes on the mask.

cv.fillPoly(mask, [np.array(right_eye_coords, dtype=np.int32)], 255)
cv.fillPoly(mask, [np.array(left_eye_coords, dtype=np.int32)], 255)

Eye landmark shapes drawn in white: mask_1.png

Writing Eyes on the mask

We have drawn the eye shapes in white on the mask, so now we can write the grayscale image wherever the mask is white. The OpenCV function bitwise_and does this: it leaves the pixels black where the mask is black and copies the image pixels where the mask is white, so the eyes from the grayscale image end up on the mask.

eyes = cv.bitwise_and(gray, gray, mask=mask)

Eyes drawn on the mask: mask_with_eyes1.png

Cropping Custom shape

Since slicing can only crop rectangles, not arbitrary shapes like eyes, we follow a simple approach that works for any shape: crop the bounding rectangle of the shape. For that we need the maximum and minimum x and y values. We already have the coordinates of the shape as a list of tuples, so we use max and min with a lambda key function to pick them out.

Finding Max and Min Values for Eyes

# For RIGHT Eye
r_max_x = (max(right_eye_coords, key=lambda item: item[0]))[0]
r_min_x = (min(right_eye_coords, key=lambda item: item[0]))[0]
r_max_y = (max(right_eye_coords, key=lambda item : item[1]))[1]
r_min_y = (min(right_eye_coords, key=lambda item: item[1]))[1]

# For LEFT Eye
l_max_x = (max(left_eye_coords, key=lambda item: item[0]))[0]
l_min_x = (min(left_eye_coords, key=lambda item: item[0]))[0]
l_max_y = (max(left_eye_coords, key=lambda item : item[1]))[1]
l_min_y = (min(left_eye_coords, key=lambda item: item[1]))[1]
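As an alternative sketch (not the approach used in this project), the same bounding values can be read from a NumPy array in two calls. The coordinates below are hypothetical and only there to make the example runnable.

```python
import numpy as np

# Hypothetical landmark coordinates as (x, y) tuples.
right_eye_coords = [(40, 50), (70, 35), (100, 50), (70, 65)]

# Column 0 holds x values, column 1 holds y values.
points = np.array(right_eye_coords)
r_min_x, r_min_y = points.min(axis=0)
r_max_x, r_max_y = points.max(axis=0)

# Same values as the max()/min() calls with a lambda key.
assert r_max_x == max(right_eye_coords, key=lambda item: item[0])[0]
assert r_min_y == min(right_eye_coords, key=lambda item: item[1])[1]

print(r_min_x, r_max_x, r_min_y, r_max_y)
```

Both versions give the same numbers; the lambda version just avoids creating an extra array.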

Cropping Eyes

You need the minimum and maximum y (height) and the minimum and maximum x (width). NumPy slicing takes the rows (y) first, then the columns (x).

#Right Eye
cropped_right = eyes[r_min_y: r_max_y, r_min_x: r_max_x]

#Left Eye
cropped_left = eyes[l_min_y: l_max_y, l_min_x: l_max_x]

Right eye: right.png

Left eye: left.png

Eyes Extractor Function

This function extracts the eyes from the input image. It requires the coordinates of the left and right eyes as lists of tuples, which were explained in the earlier blog post.

The function simply returns the cropped right and left eyes as images.

def eyesExtractor(img, right_eye_coords, left_eye_coords):
    # converting the colour image to a grayscale image
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # getting the dimension of image 
    dim = gray.shape

    # creating mask from gray scale dim
    mask = np.zeros(dim, dtype=np.uint8)

    # drawing Eyes Shape on mask with white color 
    cv.fillPoly(mask, [np.array(right_eye_coords, dtype=np.int32)], 255)
    cv.fillPoly(mask, [np.array(left_eye_coords, dtype=np.int32)], 255)

    # showing the mask 
    # cv.imshow('mask', mask)

    # draw eyes image on mask, where white shape is 
    eyes = cv.bitwise_and(gray, gray, mask=mask)
    # cv.imshow('eyes draw', eyes)
    # change black color to gray everywhere other than the eyes
    eyes[mask==0]=155

    # getting minimum and maximum x and y for right and left eyes
    # For Right Eye 
    r_max_x = (max(right_eye_coords, key=lambda item: item[0]))[0]
    r_min_x = (min(right_eye_coords, key=lambda item: item[0]))[0]
    r_max_y = (max(right_eye_coords, key=lambda item : item[1]))[1]
    r_min_y = (min(right_eye_coords, key=lambda item: item[1]))[1]

    # For LEFT Eye
    l_max_x = (max(left_eye_coords, key=lambda item: item[0]))[0]
    l_min_x = (min(left_eye_coords, key=lambda item: item[0]))[0]
    l_max_y = (max(left_eye_coords, key=lambda item : item[1]))[1]
    l_min_y = (min(left_eye_coords, key=lambda item: item[1]))[1]

    # cropping the eyes from the mask
    cropped_right = eyes[r_min_y: r_max_y, r_min_x: r_max_x]
    cropped_left = eyes[l_min_y: l_max_y, l_min_x: l_max_x]

    # returning the cropped eyes 
    return cropped_right, cropped_left

Position Estimator

This function takes the cropped eye as its input argument and returns the eye position, along with the foreground and background colours for the text.

First we find the height and width of the eye using the shape attribute, because the image is a NumPy array.

h, w = cropped_eye.shape

Removing any noise in the image, with Gaussian blur and median blur.

gaussain_blur = cv.GaussianBlur(cropped_eye, (9,9),0)
median_blur = cv.medianBlur(gaussain_blur, 3)

Next we apply a threshold to the denoised image, which converts the grayscale image into a binary image (having black (0) and white (255) pixels only). OpenCV's threshold function takes the image (median_blur), the threshold value (130), the maximum value (255), and the type of thresholding (THRESH_BINARY): pixels brighter than the threshold become the maximum value, and all others become 0.

ret, threshed_eye = cv.threshold(median_blur, 130, 255, cv.THRESH_BINARY)
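To make the rule concrete, here is a NumPy-only sketch of what THRESH_BINARY does to each pixel. The binary_threshold helper and the sample array are made up for illustration; the project itself uses cv.threshold.

```python
import numpy as np

def binary_threshold(image, thresh=130, max_value=255):
    """NumPy sketch of THRESH_BINARY: pixels above thresh become max_value, the rest 0."""
    return np.where(image > thresh, max_value, 0).astype(np.uint8)

# Tiny made-up grayscale sample with values around the threshold.
sample = np.array([[10, 130, 131],
                   [200, 129, 255]], dtype=np.uint8)

binary = binary_threshold(sample)
print(binary)
```

A pixel exactly equal to the threshold (130 here) turns black, because the comparison is strictly greater than.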

Dividing the eye into three parts works the same way as the cropping we did earlier. We keep the height the same and just divide the width into three parts.

piece = int(w/3)
right_piece = threshed_eye[0:h, 0:piece]
center_piece = threshed_eye[0:h, piece: piece+piece]
left_piece = threshed_eye[0:h, piece +piece:w]
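One detail worth checking is what happens when the width is not divisible by three. The sketch below uses a made-up 4 by 10 array as a stand-in for a thresholded eye and prints the shape of each part.

```python
import numpy as np

# Stand-in for a thresholded eye: 4 rows by 10 columns (made-up size).
threshed_eye = np.zeros((4, 10), dtype=np.uint8)
h, w = threshed_eye.shape

# Integer width of one part; int() drops the fractional remainder.
piece = int(w / 3)

right_piece = threshed_eye[0:h, 0:piece]
center_piece = threshed_eye[0:h, piece:piece + piece]
left_piece = threshed_eye[0:h, piece + piece:w]

print(right_piece.shape, center_piece.shape, left_piece.shape)
```

The last part runs up to w, so it absorbs any leftover columns and can be one or two columns wider than the other two.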

Now we call the pixel counter function, which takes the three parts of the eye as arguments and returns the eye position and a list of colours for that position. We will create that function after this.

eye_position, color = pixelCounter(right_piece, center_piece, left_piece)

Here is the complete code for the Eyes position estimator

def positionEstimator(cropped_eye):
    # getting height and width of eye 
    h, w =cropped_eye.shape

    # remove the noise from images
    gaussain_blur = cv.GaussianBlur(cropped_eye, (9,9),0)
    median_blur = cv.medianBlur(gaussain_blur, 3)

    # applying thresholding to convert to a binary image
    ret, threshed_eye = cv.threshold(median_blur, 130, 255, cv.THRESH_BINARY)

    # width of one part of the eye (a third of the total width)
    piece = int(w/3) 

    # slicing the eyes into three parts 
    right_piece = threshed_eye[0:h, 0:piece]
    center_piece = threshed_eye[0:h, piece: piece+piece]
    left_piece = threshed_eye[0:h, piece +piece:w]

    # calling pixel counter function
    eye_position, color = pixelCounter(right_piece, center_piece, left_piece)

    return eye_position, color

The Pixel Counter

This function counts the black pixels in each part of the eye (right, center, left) and finds the part with the most black pixels. Based on that, it decides and returns the position of the eyes, along with the colour list.

Counting pixels: we check which pixels in each part are black (equal to 0), then sum them up with NumPy's sum function and store the result in a variable.

right_part = np.sum(first_piece==0)
center_part = np.sum(second_piece==0)
left_part = np.sum(third_piece==0)

Create a list of the black pixel totals for each part:

eye_parts = [right_part, center_part, left_part]

Find the index of the item with the most black pixels in the list:

max_index = eye_parts.index(max(eye_parts))
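Here is a tiny worked example of the counting and the index lookup. The three 2 by 2 arrays are hypothetical thresholded pieces, small enough to count by hand.

```python
import numpy as np

# Hypothetical thresholded pieces: 0 is black, 255 is white.
first_piece = np.array([[255, 255], [255, 0]], dtype=np.uint8)
second_piece = np.array([[0, 0], [0, 255]], dtype=np.uint8)
third_piece = np.array([[255, 255], [255, 255]], dtype=np.uint8)

# (piece == 0) is a boolean array; summing it counts the True entries.
eye_parts = [int(np.sum(piece == 0))
             for piece in (first_piece, second_piece, third_piece)]

# Index of the part with the most black pixels.
max_index = eye_parts.index(max(eye_parts))

print(eye_parts, max_index)
```

If two parts tie for the maximum, index returns the first of them.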

Based on that index, assign the position and colour using simple if and elif logic. Note that index always returns 0, 1 or 2 here, so the final else branch is never reached.

pos_eye =' ' 
if max_index==0:
    pos_eye="RIGHT"
    color=[utils.BLACK, utils.GREEN]
elif max_index==1:
    pos_eye = 'CENTER'
    color = [utils.YELLOW, utils.PINK]
elif max_index ==2:
    pos_eye = 'LEFT'
    color = [utils.GRAY, utils.YELLOW]
else:
     pos_eye="Closed"
     color = [utils.GRAY, utils.YELLOW]

Complete Function Code:

def pixelCounter(first_piece, second_piece, third_piece):
    # counting black pixel in each part 
    right_part = np.sum(first_piece==0)
    center_part = np.sum(second_piece==0)
    left_part = np.sum(third_piece==0)
    # creating list of these values
    eye_parts = [right_part, center_part, left_part]

    # getting the index of max values in the list 
    max_index = eye_parts.index(max(eye_parts))
    pos_eye ='' 
    if max_index==0:
        pos_eye="RIGHT"
        color=[utils.BLACK, utils.GREEN]
    elif max_index==1:
        pos_eye = 'CENTER'
        color = [utils.YELLOW, utils.PINK]
    elif max_index ==2:
        pos_eye = 'LEFT'
        color = [utils.GRAY, utils.YELLOW]
    else:
        pos_eye="Closed"
        color = [utils.GRAY, utils.YELLOW]
    return pos_eye, color

These are the functions I have explained. If you need the complete source code, just go over to my GitHub repository Eyes Position Estimator mediapipe. You can also find a video tutorial on this project.

I am new to writing blog posts. Help me improve my writing by providing feedback in the comments; I would love to see a few suggestions. Thank you so much, have a nice day <3