AIBOX OS comes with a built-in AI object detection program configured to detect people.
This page explains this program and provides instructions on how to modify it to detect other objects such as animals or vehicles.
Please make sure you are logged into AIBOX OS and have the ability to view and edit files.
AIBOX reserves the following location for the detection program:

```
/home/cap/aicap/extmod
```
The location under /home is not configured as a RAM disk, so there is no need to disable the RAM disk before making changes.
An object detection program using YOLO11 is installed here by default.
For more information about YOLO11, see the Ultralytics YOLO11 documentation.
The files provided are as follows:
| File Name | Description |
|---|---|
| start.sh | Shell script called when the main AIBOX program starts |
| stop.sh | Shell script called when the main AIBOX program terminates |
| extmod.py | Python program that performs object detection |
| yolo11m_ncnn_model | YOLO11 model (yolo11m.pt) exported to NCNN format. For more information about NCNN export in YOLO11, see the Ultralytics documentation |
| docker-compose.yml | File containing Docker container configuration and settings |
start.sh and stop.sh contain the processing needed to start and stop the detection program automatically. In this program, they start and stop the containers with docker compose.
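As a rough sketch, the two scripts might look like the following (the contents of the bundled scripts may differ; the working directory and options here are assumptions):

```sh
#!/bin/sh
# start.sh -- sketch: bring up the detection container
# defined by docker-compose.yml in this directory
cd /home/cap/aicap/extmod
docker compose up -d
```

```sh
#!/bin/sh
# stop.sh -- sketch: stop and remove the detection container
cd /home/cap/aicap/extmod
docker compose down
```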
Object detection is performed in Python using YOLO11, but instead of the standard model we use one converted to the lighter, faster NCNN format.
AIBOX OS comes with a Docker Image pre-registered that can run YOLO11 with NCNN.
| REPOSITORY | TAG | SIZE |
|---|---|---|
| aicap/arm64/ultralytics | 1.0.250923 | 3.43GB |
This image is based on the YOLO11 image published by Ultralytics, with the modules required for NCNN conversion and execution installed, as well as the environment configuration for running the aicap command. Basic detection programs using YOLO11 that run on AIBOX OS can therefore be executed in containers launched from this image.
The built-in AI detection program uses docker compose to start containers, so we place docker-compose.yml. The volumes / environment / network_mode specified in this file are explained below.
```yaml
volumes:
  - /usr/local/aicap:/usr/local/aicap
  - /home/cap/aicap:/home/cap/aicap
  - /var/www:/var/www
environment:
  MODEL_FILE_NAME: yolo11m_ncnn_model
  PREVIEW_IMAGE_PATH: /var/www/html/result.jpg
network_mode: host
```
The volumes parameter maps directories between the host and container. Here we specify three mappings.
The first two are necessary for running the aicap command and Python programs, so please configure them the same way when creating your own detection program.
The third mapping for the Nginx HTTP public directory is necessary when you want to view detection result images in a browser.
| Mapping Source & Destination | Description |
|---|---|
| /usr/local/aicap | Execution directory for aicap command |
| /home/cap/aicap | Location for detection program files and various configuration files |
| /var/www | Nginx HTTP public directory. Used to view detection result images in a browser by saving them there |
By default, public access to the Nginx HTTP public directory is turned OFF.
Environment variables set the following two values required for Python program (extmod.py) execution:
| Environment Variable Name | Description |
|---|---|
| MODEL_FILE_NAME | Name of the YOLO model to use (yolo11m_ncnn_model) |
| PREVIEW_IMAGE_PATH | Save location for detection result images (/var/www/html/result.jpg) |
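Inside extmod.py, these values would typically be read with os.environ, along the lines of this sketch:

```python
import os

# Settings injected by docker-compose.yml (names match the table above)
MODEL_FILE_NAME = os.environ["MODEL_FILE_NAME"]
PREVIEW_IMAGE_PATH = os.environ["PREVIEW_IMAGE_PATH"]
```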
The aicap command accesses the capture program (cvc) to obtain camera frame images, but cvc accepts requests only from localhost. network_mode must therefore be set to host so that the host and container share the same network.
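Putting the pieces together, the whole docker-compose.yml is roughly shaped like this (the service name and command line are assumptions for illustration; the image, volumes, environment, and network_mode values come from this page):

```yaml
services:
  extmod:                       # service name: assumption
    image: aicap/arm64/ultralytics:1.0.250923
    command: python3 /home/cap/aicap/extmod/extmod.py   # entry point: assumption
    volumes:
      - /usr/local/aicap:/usr/local/aicap
      - /home/cap/aicap:/home/cap/aicap
      - /var/www:/var/www
    environment:
      MODEL_FILE_NAME: yolo11m_ncnn_model
      PREVIEW_IMAGE_PATH: /var/www/html/result.jpg
    network_mode: host
```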
The object detection program (extmod.py) consists of the following four functions and a main function that calls them.
Of these, the get_frame and push functions can be reused as utility functions for running the aicap command from Python programs (a sketch follows the table below).
| Function Name | Description |
|---|---|
| get_frame | Executes the aicap get_frame command. Frame images are retrieved in memory via standard output |
| push | Executes the aicap push command. The image to send, i.e. the frame with red detection boxes drawn on it, is passed to the command via standard input |
| create_result_jpeg | Draws red frames for object detection on the camera frame image obtained by get_frame. |
| parse_results | Formats the results from YOLO's predict method into an easy-to-use format. |
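As a sketch of how get_frame and push can wrap the aicap command from Python (the option-free invocations here are a simplification; the actual extmod.py may pass additional arguments):

```python
import subprocess

def get_frame() -> bytes:
    # Run `aicap get_frame`; the frame image arrives on standard
    # output and is kept in memory as bytes.
    proc = subprocess.run(["aicap", "get_frame"],
                          capture_output=True, check=True)
    return proc.stdout

def push(image: bytes) -> None:
    # Run `aicap push`; the annotated image is handed over
    # via standard input.
    subprocess.run(["aicap", "push"], input=image, check=True)
```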
Looking at the main function, you will see it is a very simple program.
The processing flow is as follows, which repeats continuously:
[Get camera frame image] → [Run object detection with YOLO] → [Format results] → [Draw detection frame if detected] → [Send push notification] → [Save preview image]
The aicap push command can specify the image to send as an argument.
In this program, the camera frame image with red boxes drawn from the object detection results is passed as that argument.
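A simplified sketch of this loop, building on the function sketches above and the globals shown later on this page (decoding details and error handling are omitted, and the exact code in extmod.py may differ):

```python
import cv2
import numpy as np
from ultralytics import YOLO

def main() -> None:
    model = YOLO(MODEL_FILE_NAME)
    while True:
        # [Get camera frame image]
        frame = cv2.imdecode(np.frombuffer(get_frame(), np.uint8),
                             cv2.IMREAD_COLOR)
        # [Run object detection with YOLO]
        results = model.predict(frame, conf=CONF, iou=IOU, classes=CLASSES)
        # [Format results]
        detections = parse_results(results)
        # [Draw detection frame if detected]
        jpeg = create_result_jpeg(frame, detections)
        if detections:
            # [Send push notification]
            push(jpeg)
        # [Save preview image]
        with open(PREVIEW_IMAGE_PATH, "wb") as f:
            f.write(jpeg)
```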
To change the objects to detect from people to other objects, or to adjust detection accuracy, modify the values of global variables in the program.
To change the objects to detect, modify the value of the CLASSES variable defined on line 51.
```python
#
# Label IDs of objects to detect
# Multiple selections possible (specify as array)
# When using models provided by YOLO,
# specify the index of COCO (Common Objects in Context)
# (excerpt)
# ID Class Name
# 0 person
# 1 bicycle
# 2 car
# 3 motorcycle
# 5 bus
# 16 dog
# 17 horse
# 18 sheep
# 19 cow
# 21 bear
# For example, to detect cars, buses, and motorcycles: CLASSES = [2, 3, 5]
#
CLASSES = [0]
```
As noted in the comments, you can change the objects to detect by modifying this value.
For example, if you want to detect bears, specify 21.
```python
# 21 bear
CLASSES = [21]
```
You can also specify multiple objects. For example, to detect both cars and people together, write [0, 2].
```python
# 0 person
# 2 car
CLASSES = [0, 2]
```
In actual operation, you may encounter false detections or missed detections. In such cases, adjust the object detection parameters.
The parameters to adjust are the CONF variable defined on line 24 and the IOU variable defined on line 30.
```python
# confidence threshold
# Detection confidence threshold (0.0 ~ 1.0)
# Detections with a confidence score below this threshold
# are not included in the results
CONF = 0.3
```
```python
# Intersection over Union
# Overlap of detection boxes (0.0 ~ 1.0)
# YOLO may detect the same object with multiple candidate boxes in an image.
# This is the threshold that determines which boxes to retain as results
# (i.e., the threshold for avoiding duplicate detections)
IOU = 0.5
```
You will mainly adjust the CONF parameter. Simply put, it is the detection threshold. Lowering it detects more objects but also increases false detections. Raising it reduces false detections but increases missed detections.
The optimal values vary depending on the clarity of the camera image and its installation location.
While monitoring the detection result information sent with push notifications, adjust the values so that unnecessary detections are eliminated.