Drowsiness (Sleep) Detection Using Machine Learning – Part 4 of 5

Methods used:


CascadeClassifier is a class for detecting objects in an image or in the frames of a video stream. Its .load method loads an .xml classifier file, which can be either a Haar or an LBP classifier, and its .detectMultiScale method performs the detection.
VideoCapture is the class used to capture the video. Its default constructor creates a capture object without opening a source, and the constructor is overloaded so that a source can be passed in directly. In this project, it is called with an index, which opens a camera for video capture. To open the default camera, pass 0 as the index.
cvtColor converts an input image from one color space to another. For a transformation to or from the RGB color space, the order of the channels must be specified explicitly (RGB or BGR). Note that the default color format in OpenCV is often referred to as RGB, but it is actually BGR (the bytes are reversed). So the first byte in a standard (24-bit) color image is an 8-bit Blue component, the second byte is Green, and the third byte is Red. The fourth, fifth, and sixth bytes are then the second pixel (Blue, then Green, then Red), and so on. The conventional ranges for R, G, and B channel values are:

  • 0 to 255 for CV_8U images
  • 0 to 65535 for CV_16U images
  • 0 to 1 for CV_32F images

For linear transformations, the range does not matter. For a non-linear transformation such as RGB → L*u*v*, however, the input RGB image should be normalized to the proper value range to get correct results. For example, a 32-bit floating-point image converted directly from an 8-bit image without any scaling has the 0..255 value range instead of the 0..1 range the function assumes. So, before calling cvtColor, you first need to scale the image down.

The grayscale conversion code passed to cvtColor (cv2.COLOR_BGR2GRAY for OpenCV's default BGR order) converts the color image to grayscale.
The font constant denotes the font type used when drawing text. OpenCV provides these as the cv2.FONT_HERSHEY_* constants.
cv2.rectangle draws a rectangle on an image.
cv2.putText draws a text string on an image. Its signature is:

cv2.putText(image, text, org, font, fontScale, color[, thickness[, lineType[, bottomLeftOrigin]]])

The imshow function displays an image in the specified window. If the window was created with the cv::WINDOW_AUTOSIZE flag, the image is shown at its original size, although it is still limited by the screen resolution. Otherwise, the image is scaled to fit the window. The function may also scale the pixel values, depending on the image depth.
Calling waitKey makes the program wait for a key press. The function waits for a key event indefinitely when delay ≤ 0, or for delay milliseconds when delay is positive. Because the OS has a minimum time between thread switches, the function does not wait exactly delay ms; it waits at least delay ms, depending on what else is running on your computer at that time. It returns the code of the pressed key, or -1 if no key was pressed before the specified time elapsed.

This is the fourth in a series of five articles. See you tomorrow with more! Stay tuned!


Full code available at:


Praveeni Chethana
