Built-in AI Detection Program

Contents

1. Introduction
2. File Configuration List
3. About Docker
4. Detection Program Explanation
5. Changing Detection Objects and Adjusting Accuracy

1. Introduction
AIBOX OS comes with a built-in AI object detection program configured to detect people.
This page explains this program and provides instructions on how to modify it to detect other objects such as animals or vehicles.

Please make sure you are logged into AIBOX OS and have the ability to view and edit files.

2. File Configuration List

AIBOX OS provides a location where the detection program is placed:

Installation Location
/home/cap/aicap/extmod

Hint
The location under /home is not configured as a RAM disk, so there is no need to disable the RAM disk before making changes.

An object detection program using YOLO11 is installed here by default.

For more information about YOLO11, see here

The files provided are as follows:

These files are available on GitHub.

File Name            Description
start.sh             Shell script called when the main AIBOX program starts
stop.sh              Shell script called when the main AIBOX program terminates
extmod.py            Python program that performs object detection
yolo11m_ncnn_model   YOLO11 model (yolo11m.pt) exported in NCNN format
docker-compose.yml   Docker container configuration and settings

For more information about NCNN export in YOLO11, see here

start.sh and stop.sh contain the processing needed to start and stop the detection program automatically alongside the main AIBOX program. In this program, that processing starts and stops the containers with docker compose.

Object detection is performed in Python using YOLO11; however, instead of the standard model, the program uses the model converted to NCNN format, which is lighter and faster.
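For reference, this NCNN conversion can be reproduced with the Ultralytics export API. The following is a minimal sketch, separate from the built-in program, assuming the ultralytics package is installed:

from ultralytics import YOLO

# Export the standard PyTorch model to NCNN format.
# This produces a directory such as yolo11m_ncnn_model.
model = YOLO("yolo11m.pt")
model.export(format="ncnn")

# The exported model loads the same way as the original:
ncnn_model = YOLO("yolo11m_ncnn_model")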

3. About Docker

AIBOX OS comes with a pre-registered Docker image that can run YOLO11 with NCNN.

docker images
REPOSITORY                TAG          SIZE
aicap/arm64/ultralytics   1.0.250923   3.43GB

This image is based on the YOLO11 image published by Ultralytics, with the modules required for NCNN conversion and execution installed and the environment configured for running the aicap command. Basic detection programs that use YOLO11 and run on AIBOX OS can therefore be executed in containers launched from this image.

The Dockerfile is available on GitHub.

The built-in AI detection program starts its containers with docker compose, so a docker-compose.yml is placed here. The volumes, environment, and network_mode settings in this file are explained below.

docker-compose.yml is available on GitHub.

volumes:
    - /usr/local/aicap:/usr/local/aicap
    - /home/cap/aicap:/home/cap/aicap
    - /var/www:/var/www
environment:
    MODEL_FILE_NAME: yolo11m_ncnn_model
    PREVIEW_IMAGE_PATH: /var/www/html/result.jpg
network_mode: host
            

volumes

The volumes parameter maps directories between the host and container. Here we specify three mappings.
The first two are necessary for running the aicap command and Python programs, so please configure them the same way when creating your own detection program.
The third mapping for the Nginx HTTP public directory is necessary when you want to view detection result images in a browser.

Mapping Source & Destination   Description
/usr/local/aicap               Execution directory for the aicap command
/home/cap/aicap                Location for the detection program files and various configuration files
/var/www                       Nginx HTTP public directory; used to view detection result images in a browser by saving them here
Important Note

By default, public access to the Nginx HTTP public directory is turned OFF.

environment

The environment variables set the following two values, which are required to run the Python program (extmod.py):

Environment Variable Name   Description
MODEL_FILE_NAME             Name of the YOLO model to use (yolo11m_ncnn_model)
PREVIEW_IMAGE_PATH          Save location for detection result images (/var/www/html/result.jpg)
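Inside the container, extmod.py can pick these values up with os.environ. A minimal sketch (the exact variable handling in extmod.py may differ):

import os

# Values supplied through docker-compose.yml
MODEL_FILE_NAME = os.environ["MODEL_FILE_NAME"]        # e.g. yolo11m_ncnn_model
PREVIEW_IMAGE_PATH = os.environ["PREVIEW_IMAGE_PATH"]  # e.g. /var/www/html/result.jpg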

network_mode: host

The aicap command accesses the capture program (cvc) to obtain camera frame images, but cvc is configured to only accept requests from localhost. Therefore, it is necessary to set network_mode to host to share the network between the host and container.

4. Detection Program Explanation

The full source code (extmod.py) is available on GitHub.

Function List

The object detection program (extmod.py) consists of the following four functions and a main function that calls them.

Of these, the get_frame and push functions can be reused as utility functions for running the aicap command from Python programs (see the sketch after the table).

Function Name        Description
get_frame            Executes the aicap get_frame command. The frame image is received in memory via standard output.
push                 Executes the aicap push command. The image to send is passed in via standard input; here it is used to send the image with red detection boxes drawn on it.
create_result_jpeg   Draws red detection boxes on the camera frame image obtained by get_frame.
parse_results        Formats the results of YOLO's predict method into an easy-to-use structure.
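As a rough illustration of how such utilities can be written, the sketch below wraps the aicap command with subprocess. It assumes that aicap get_frame writes the JPEG frame to standard output and that aicap push reads the image to send from standard input; refer to extmod.py on GitHub for the actual implementation:

import subprocess

def get_frame() -> bytes:
    # Run `aicap get_frame` and capture the frame image from standard output.
    result = subprocess.run(["aicap", "get_frame"],
                            capture_output=True, check=True)
    return result.stdout

def push(image: bytes) -> None:
    # Run `aicap push`, supplying the image to send via standard input.
    subprocess.run(["aicap", "push"], input=image, check=True)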

Processing Flow

Looking at the main function, you will see it is a very simple program.

The processing flow is as follows, which repeats continuously:

[Get camera frame image] → [Run object detection with YOLO] → [Format results] → [Draw detection frame if detected] → [Send push notification] → [Save preview image]
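Expressed as simplified code, the loop might look like the sketch below. It reuses get_frame, push, create_result_jpeg, and parse_results from the table above, plus the CONF, IOU, and CLASSES globals described in the next section; the exact arguments and error handling are assumptions, so see extmod.py for the real implementation:

import os
import cv2
import numpy as np
from ultralytics import YOLO

MODEL_FILE_NAME = os.environ["MODEL_FILE_NAME"]
PREVIEW_IMAGE_PATH = os.environ["PREVIEW_IMAGE_PATH"]

def main() -> None:
    model = YOLO(MODEL_FILE_NAME)
    while True:
        jpeg = get_frame()                                    # [Get camera frame image]
        frame = cv2.imdecode(np.frombuffer(jpeg, np.uint8), cv2.IMREAD_COLOR)
        results = model.predict(frame, conf=CONF, iou=IOU,    # [Run object detection with YOLO]
                                classes=CLASSES)
        detections = parse_results(results)                   # [Format results]
        if detections:
            annotated = create_result_jpeg(frame, detections) # [Draw detection frame if detected]
            push(annotated)                                   # [Send push notification]
        with open(PREVIEW_IMAGE_PATH, "wb") as f:             # [Save preview image]
            f.write(annotated if detections else jpeg)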

About Push Images

The aicap push command can specify the image to send as an argument.

In this program, the camera frame image with the red detection boxes drawn on it is passed as that argument, so the notification is sent with the annotated image.

5. Changing Detection Objects and Adjusting Accuracy

The full source code (extmod.py) is available on GitHub.

To change the objects to detect from people to other objects, or to adjust detection accuracy, modify the values of global variables in the program.

Changing Detection Objects

To change the objects to detect, modify the value of the CLASSES variable defined on line 51.


#
# Label IDs of objects to detect
# Multiple selections possible (specify as array)
# When using models provided by Yolo,
# specify the index of COCO (Common Objects in Context)
# (excerpt)
# ID    Class Name
# 0     person
# 1     bicycle
# 2     car
# 3     motorcycle
# 5     bus
# 16    dog
# 17    horse
# 18    sheep
# 19    cow
# 21    bear
# For example, to detect cars, motorcycles, and buses: CLASSES = [2, 3, 5]
#
CLASSES = [0]
            

As noted in the comments, you can change the objects to detect by modifying this value.
For example, if you want to detect bears, specify 21.


# 21    bear
CLASSES = [21]
            

You can also specify multiple objects. For example, to detect both cars and people together, write [0, 2].


# 0     person
# 2     car
CLASSES = [0, 2]
            

Adjusting Detection Accuracy

In actual operation, you will encounter false detections and missed detections. In such cases, adjust the object detection parameters.

The parameters to adjust are the CONF variable defined on line 24 and the IOU variable defined on line 30.


# confidence threshold
# Detection confidence threshold (0.0 ~ 1.0)
# Detections with a confidence score below this threshold
# are not included in the results
CONF = 0.3

# Intersection over Union
# Overlap of detection boxes (0.0 ~ 1.0)
# YOLO may detect the same object with multiple candidate boxes in an image.
# This is the threshold to determine which boxes to retain as results
# = threshold to avoid duplicate detection
IOU = 0.5
            

You will mainly adjust the CONF parameter. Simply put, it is the detection threshold. Lowering it detects more objects but also increases false detections. Raising it reduces false detections but increases missed detections.

The optimal values vary depending on the clarity of the camera image and its installation location.

While monitoring the detection result information sent with push notifications, adjust the values so that unnecessary detections are eliminated.