Flutter Yolo11n Sample

cornpip | 2025. 12. 13. 01:56

Flutter YOLO11n Sample Repository

https://github.com/cornpip/flutter_vision_ai_demos

GitHub - cornpip/flutter_vision_ai_demos: camera-based AI vision demos (YOLO, MediaPipe Face)

Let's walk through a few key points.

YOLO11 To TFLite

https://github.com/cornpip/pt_to_tflite

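The sample converts `yolo11n.pt` to TFLite.

`yolo11_to_tflite.py`

The script can be run inside the `ultralytics/ultralytics:latest` Docker image.

Ultralytics provides an export method for YOLO11 that writes a TFLite model (`model.export(format="tflite")`).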

https://docs.ultralytics.com/ko/integrations/tflite/

Camera Plugin

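The sample uses the camera plugin and hooks its processing into the camera stream like this: `await controller.startImageStream(_processCameraImage);`

By default, frames are not accumulated in a queue. Earlier frames are dropped so that only the latest frame is ever processed.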
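How the dropping is done isn't shown here, so the following is only a minimal sketch of one common approach: a busy flag that rejects new frames while the previous one is still in flight. `LatestFrameGate` and its members are hypothetical names, not code from the sample.

```dart
import 'dart:async';

/// Accepts a frame only when no other frame is being processed.
/// (Hypothetical helper; the sample's actual mechanism may differ.)
class LatestFrameGate<T> {
  LatestFrameGate(this._process);

  final Future<void> Function(T frame) _process;
  bool _busy = false;

  /// Number of frames rejected because processing was still running.
  int dropped = 0;

  /// Returns true if the frame was handed to the processor,
  /// false if it was dropped.
  bool offer(T frame) {
    if (_busy) {
      dropped++;
      return false;
    }
    _busy = true;
    _process(frame).whenComplete(() {
      _busy = false;
    });
    return true;
  }
}
```

In a stream callback this would be used as `gate.offer(cameraImage)`: any frame that arrives during inference is simply skipped instead of piling up.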

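The plugin also provides `CameraPreview`.

For the front camera, however, the preview has to be mirrored horizontally to get the familiar selfie view.

The preview can be mirrored with a transform matrix, for example by starting from `Matrix4.identity()` and scaling the x axis by -1.

If you want to overlay boxes on the preview, it is best to feed the inference (predict) step an image that is mirrored in the same way.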
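To make "mirrored in the same way" concrete, here is a plain-Dart sketch that flips a tightly packed RGB buffer horizontally. The sample does its flip inside the C++/OpenCV preprocessing, so `flipHorizontalRgb` is a hypothetical illustration only, and it assumes 3 bytes per pixel with no row padding.

```dart
import 'dart:typed_data';

/// Returns a horizontally mirrored copy of a packed RGB image.
/// Assumes width * height * 3 bytes, row-major, no padding.
Uint8List flipHorizontalRgb(Uint8List rgb, int width, int height) {
  if (rgb.length != width * height * 3) {
    throw ArgumentError('Buffer size does not match width * height * 3');
  }
  final flipped = Uint8List(rgb.length);
  for (var y = 0; y < height; y++) {
    for (var x = 0; x < width; x++) {
      final src = (y * width + x) * 3;
      // Same row, column counted from the right edge.
      final dst = (y * width + (width - 1 - x)) * 3;
      flipped[dst] = rgb[src];
      flipped[dst + 1] = rgb[src + 1];
      flipped[dst + 2] = rgb[src + 2];
    }
  }
  return flipped;
}
```

With the input flipped like this, box coordinates coming out of the model already match the mirrored preview and need no extra x correction.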

Isolate

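In Flutter, heavy work such as inference and preprocessing needs to run in a separate Isolate so that the UI thread is not blocked.

For this, the sample uses the FFI Plugin project structure covered in the previous post.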

    // main isolate
    _inferenceIsolate = await Isolate.spawn(
      _yoloIsolateEntry,
      initArgs,
      errorsAreFatal: true,
    );
    _inferenceSendPort = await readyPort.first as SendPort;
    readyPort.close();

    // _yoloIsolateEntry (separate isolate)
    final SendPort readyPort = initialMessage[0] as SendPort;
    ...
    final receivePort = ReceivePort();
    readyPort.send(receivePort.sendPort);
    ...
    await for (final dynamic message in receivePort) {
      ...
      final command = message[0];
      if (command == 'predict') {
        final payload =
            (message[1] as Map<dynamic, dynamic>).cast<String, dynamic>();
        final SendPort replyTo = message[2] as SendPort;
        try {
          final detections = handler.predict(payload);
          replyTo.send(detections);
        } catch (error, stackTrace) {
          replyTo.send({
            'error': error.toString(),
            'stackTrace': stackTrace.toString(),
          });
        }
      }
      ...
    }

The main isolate spawns a separate isolate whose entry point is `_yoloIsolateEntry`.

`_yoloIsolateEntry` first sends the SendPort of its own receivePort back through the readyPort it was given (used only for the initialization handshake).

Once main has received that SendPort, it closes the readyPort.

From then on, `_yoloIsolateEntry` waits for messages on its receivePort inside the separate isolate.
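The handshake is easier to follow in a stripped-down form. The sketch below uses only `dart:isolate`; `workerEntry`, `spawnWorker`, `request`, and the `'echo'`/`'shutdown'` commands are hypothetical stand-ins, not the sample's code.

```dart
import 'dart:isolate';

/// Entry point of the worker isolate (stand-in for _yoloIsolateEntry).
void workerEntry(SendPort readyPort) {
  final receivePort = ReceivePort();
  // Handshake: hand our SendPort back to the spawning isolate.
  readyPort.send(receivePort.sendPort);
  receivePort.listen((dynamic message) {
    final parts = message as List<dynamic>;
    if (parts[0] == 'echo') {
      (parts[2] as SendPort).send('echo: ${parts[1]}');
    } else if (parts[0] == 'shutdown') {
      receivePort.close(); // Lets the worker isolate exit.
    }
  });
}

/// Spawns the worker and completes with the SendPort it reports back.
Future<SendPort> spawnWorker() async {
  final readyPort = ReceivePort();
  await Isolate.spawn(workerEntry, readyPort.sendPort, errorsAreFatal: true);
  // `first` takes one message and cancels the subscription,
  // which also closes the port.
  return await readyPort.first as SendPort;
}

/// Sends one request and waits for the single reply on a fresh port.
Future<dynamic> request(SendPort worker, String command, Object? payload) {
  final responsePort = ReceivePort();
  worker.send([command, payload, responsePort.sendPort]);
  return responsePort.first;
}
```

Calling `await request(await spawnWorker(), 'echo', 'hi')` gives back the worker's reply. The message shape `[command, payload, replyPort]` mirrors the one used in the snippet above.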

    final sendPort = _inferenceSendPort;
    sendPort.send([
      'predict',
      _serializeCameraImage(
        cameraImage,
        sensorOrientation,
        lensDirection,
      ),
      responsePort.sendPort,
    ]);
    
    final dynamic result = await responsePort.first;
    responsePort.close();

The main isolate sends a predict request to `_yoloIsolateEntry` together with a responsePort and awaits the reply.

`_yoloIsolateEntry` runs the predict step and sends the result back through the responsePort.
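Because the worker reports failures as a map with `'error'` and `'stackTrace'` keys, the caller has to tell that apart from a normal result. A minimal sketch of that check follows. `InferenceException` and `unwrapReply` are hypothetical names, and treating a successful reply as a `List` is an assumption.

```dart
/// Raised when the worker isolate reports a failure.
class InferenceException implements Exception {
  InferenceException(this.message, this.stackTrace);

  final String message;
  final String stackTrace;

  @override
  String toString() => 'InferenceException: $message';
}

/// Returns the detections, or throws if the reply is an error map.
/// Assumes a successful reply is a List (not confirmed by the sample).
List<dynamic> unwrapReply(dynamic reply) {
  if (reply is Map && reply.containsKey('error')) {
    throw InferenceException(
      reply['error'].toString(),
      (reply['stackTrace'] ?? '').toString(),
    );
  }
  return reply as List<dynamic>;
}
```

This would be applied to the value obtained from `await responsePort.first` before the detections are drawn.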

TFLite Predict

Preprocessing, FFI (C++/OpenCV), inputTensor

`CameraImage` → `_CameraFrameData` → `_preprocess` (including FFI) → inputTensor

    // _serializeCameraImage
    final planes = cameraImage.planes
        .map(
          (plane) => {
            'bytesPerRow': plane.bytesPerRow,
            'bytesPerPixel': plane.bytesPerPixel,
            'bytes': TransferableTypedData.fromList(
              [Uint8List.fromList(plane.bytes)],
            ),
          },
        )
        .toList();

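When a message is passed between isolates through a SendPort, its data is copied.

Wrapping the bytes in TransferableTypedData lets the send itself hand the buffer over ("transfer" it) instead of copying it again; the receiver calls `materialize()` once to get the bytes back.

Large data such as a Uint8List is wrapped in TransferableTypedData for performance and efficiency.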
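Both ends of the transfer can be sketched with `dart:isolate` alone. `packPlane` and `unpackPlane` are hypothetical helpers; the map keys follow the `_serializeCameraImage` snippet above.

```dart
import 'dart:isolate';
import 'dart:typed_data';

/// Sender side: wrap the plane bytes so they can be transferred.
Map<String, Object> packPlane(Uint8List bytes, int bytesPerRow) {
  return {
    'bytesPerRow': bytesPerRow,
    'bytes': TransferableTypedData.fromList([bytes]),
  };
}

/// Receiver side: take ownership of the bytes.
/// materialize() may be called only once per TransferableTypedData.
Uint8List unpackPlane(Map<String, Object> plane) {
  final transferable = plane['bytes'] as TransferableTypedData;
  return transferable.materialize().asUint8List();
}
```

Note that `TransferableTypedData.fromList` itself copies the input once when the wrapper is created; what is avoided is the additional copy at send time.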

Up to `_CameraFrameData`, the data is still in YUV420 (Android) or NV12 (iOS) format.

The `_preprocess` FFI call handles rotation, horizontal flip, resize, and letterboxing in a single pass and returns an RGB byte array.
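For readers who haven't worked with NV12, here is a rough sketch of what the color conversion involves for a single pixel. It uses full-range BT.601 coefficients, which is an assumption: the sample converts inside C++/OpenCV, which may use a different range. `nv12PixelToRgb` is a hypothetical name.

```dart
import 'dart:typed_data';

/// Converts one pixel of an NV12 frame to [r, g, b].
/// NV12 = full-resolution Y plane + half-resolution interleaved UV plane.
/// Full-range BT.601 coefficients are assumed here.
List<int> nv12PixelToRgb(
  Uint8List yPlane,
  Uint8List uvPlane,
  int yRowStride,
  int uvRowStride,
  int x,
  int y,
) {
  final luma = yPlane[y * yRowStride + x];
  // Each U,V pair is shared by a 2x2 block of pixels.
  final uvIndex = (y ~/ 2) * uvRowStride + (x ~/ 2) * 2;
  final u = uvPlane[uvIndex] - 128;
  final v = uvPlane[uvIndex + 1] - 128;

  int clamp8(double value) => value.round().clamp(0, 255).toInt();

  return [
    clamp8(luma + 1.402 * v),
    clamp8(luma - 0.344136 * u - 0.714136 * v),
    clamp8(luma + 1.772 * u),
  ];
}
```

Doing this per pixel in Dart for every frame would be slow, which is one reason to push the whole step into native code through FFI.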

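By default, YOLO11 applies a letterbox resize to the input image (keep the original aspect ratio and pad the remaining area).

When using a TFLite model, it is very important to implement the input and output image processing exactly as it was done during training.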
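The letterbox geometry boils down to one scale factor and two padding offsets, and the same numbers are needed later to map boxes back to the source image. A minimal sketch, assuming a square model input and centered padding; `Letterbox` and `computeLetterbox` are hypothetical names.

```dart
import 'dart:math' as math;

/// Geometry of a letterbox resize into a square model input.
class Letterbox {
  Letterbox(this.scale, this.padLeft, this.padTop);

  final double scale; // Source pixels -> model pixels.
  final int padLeft;
  final int padTop;

  /// Maps model-space coordinates back to source-image pixels.
  double toSourceX(double modelX) => (modelX - padLeft) / scale;
  double toSourceY(double modelY) => (modelY - padTop) / scale;
}

/// Computes scale and centered padding (centering is an assumption).
Letterbox computeLetterbox(int srcWidth, int srcHeight, int inputSize) {
  // Use the smaller ratio so the whole image fits inside the input.
  final scale = math.min(inputSize / srcWidth, inputSize / srcHeight);
  final resizedWidth = (srcWidth * scale).round();
  final resizedHeight = (srcHeight * scale).round();
  return Letterbox(
    scale,
    (inputSize - resizedWidth) ~/ 2,
    (inputSize - resizedHeight) ~/ 2,
  );
}
```

For a 1280×720 frame and a 640 input this gives a scale of 0.5 and 140 rows of padding above and below the 640×360 image.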

The flat RGB byte array returned by `_preprocess` is then reshaped to match the model's input size.

TFLite expects an NHWC shape, e.g. [1, 640, 640, 3].
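As a sketch of that reshaping step, the function below turns packed RGB bytes into a nested `[1][H][W][3]` list. Scaling the values to 0..1 is an assumption (typical for float YOLO models, but not stated here), and `toNhwc` is a hypothetical name.

```dart
import 'dart:typed_data';

/// Reshapes packed RGB bytes into a [1][height][width][3] tensor.
/// Values are scaled to 0..1 (assumed normalization).
List<List<List<List<double>>>> toNhwc(Uint8List rgb, int height, int width) {
  if (rgb.length != height * width * 3) {
    throw ArgumentError('Buffer size does not match height * width * 3');
  }
  return [
    List.generate(height, (y) {
      return List.generate(width, (x) {
        final base = (y * width + x) * 3;
        return <double>[
          rgb[base] / 255.0,
          rgb[base + 1] / 255.0,
          rgb[base + 2] / 255.0,
        ];
      });
    }),
  ];
}
```

Nested lists are easy to read but allocate heavily; for real-time use, a flat `Float32List` reshaped once is usually the cheaper choice.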

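OutputTensor, post-processing

The YOLO11 model's output shape is [1, 84, 8400].

outputBuffer is allocated with the same shape and filled with zeros, and then inference runs with `_interpreter.run(inputTensor, outputBuffer);`.

1 = batch size: one inference processes a single image.
84 = number of values per detection candidate (anchor-free cell): the first 4 are the box coordinates (x, y, w, h) and the remaining 80 are the per-class scores for the COCO classes.
8400 = number of detection candidates over the whole image (for a 640×640 input: 80×80 + 40×40 + 20×20 grid cells at strides 8, 16, and 32).

`_flattenTensor`: flattens the outputBuffer filled by `Interpreter.run` into a 1-D Float32List so it can be traversed in a single pass.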
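The buffer allocation and the flattening can be sketched as follows. `zeroOutput`, `flattenOutput`, and `valueAt` are hypothetical names, and the channel-major index formula simply follows from the [1, 84, 8400] shape.

```dart
import 'dart:typed_data';

/// Allocates a zero-filled [1][channels][candidates] output buffer.
List<List<List<double>>> zeroOutput(int channels, int candidates) {
  return [
    List.generate(channels, (_) => List<double>.filled(candidates, 0.0)),
  ];
}

/// Flattens [1][channels][candidates] into one Float32List,
/// keeping the channel-major order.
Float32List flattenOutput(List<List<List<double>>> output) {
  final channels = output[0].length;
  final candidates = output[0][0].length;
  final flat = Float32List(channels * candidates);
  for (var c = 0; c < channels; c++) {
    flat.setRange(c * candidates, (c + 1) * candidates, output[0][c]);
  }
  return flat;
}

/// Value of `channel` for candidate `index` in the flat buffer.
double valueAt(Float32List flat, int candidates, int channel, int index) {
  return flat[channel * candidates + index];
}
```

With this layout, all 8400 x-centers come first, then all y-centers, and so on, so the values of one candidate are 8400 elements apart rather than adjacent.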

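`_decodeDetections`: interprets the flattened buffer according to the YOLO output layout.

For each candidate it reads xCenter, yCenter, width, height, objectness, and the class logits, applies a sigmoid to get class probabilities, and multiplies them by objectness to get the confidence. Note that the [1, 84, 8400] layout above (4 box values + 80 class scores) has no separate objectness channel, and Ultralytics exports normally emit class scores that are already sigmoid-activated, so for this model the class score can serve as the confidence directly. Candidates below the threshold are discarded, and the rest go through `_buildBoundingBox`, which undoes the letterbox and normalizes the coordinates, to become `Detection` objects. Finally, `_nonMaxSuppression` (NMS) removes duplicate boxes that overlap with a high IoU.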
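The decode and NMS steps can be sketched in plain Dart as below. This is not the sample's implementation: it assumes the 4 + numClasses channel-major layout with already-activated class scores and no objectness, and it suppresses per class. `Detection`, `decode`, `iou`, and `nonMaxSuppression` are names chosen for illustration.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

class Detection {
  Detection(this.left, this.top, this.right, this.bottom, this.classId,
      this.score);
  final double left, top, right, bottom;
  final int classId;
  final double score;
}

/// Decodes a channel-major buffer: 4 box channels, then one per class.
List<Detection> decode(
    Float32List out, int numClasses, int numCandidates, double minScore) {
  final detections = <Detection>[];
  for (var i = 0; i < numCandidates; i++) {
    var bestClass = 0;
    var bestScore = 0.0;
    for (var c = 0; c < numClasses; c++) {
      final score = out[(4 + c) * numCandidates + i];
      if (score > bestScore) {
        bestScore = score;
        bestClass = c;
      }
    }
    if (bestScore < minScore) continue;
    final cx = out[i];
    final cy = out[numCandidates + i];
    final w = out[2 * numCandidates + i];
    final h = out[3 * numCandidates + i];
    // Convert center/size to corner coordinates.
    detections.add(Detection(
        cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, bestClass, bestScore));
  }
  return detections;
}

/// Intersection over union of two boxes.
double iou(Detection a, Detection b) {
  final iw =
      math.max(0.0, math.min(a.right, b.right) - math.max(a.left, b.left));
  final ih =
      math.max(0.0, math.min(a.bottom, b.bottom) - math.max(a.top, b.top));
  final inter = iw * ih;
  final union = (a.right - a.left) * (a.bottom - a.top) +
      (b.right - b.left) * (b.bottom - b.top) -
      inter;
  return union <= 0 ? 0.0 : inter / union;
}

/// Greedy per-class NMS: keep the highest-scoring box, drop overlaps.
List<Detection> nonMaxSuppression(List<Detection> input, double iouThreshold) {
  final sorted = [...input]..sort((a, b) => b.score.compareTo(a.score));
  final kept = <Detection>[];
  for (final candidate in sorted) {
    final overlaps = kept.any((k) =>
        k.classId == candidate.classId && iou(k, candidate) > iouThreshold);
    if (!overlaps) kept.add(candidate);
  }
  return kept;
}
```

The boxes here are still in model-input coordinates; the letterbox padding and scale have to be undone afterwards before drawing them over the preview.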

Delegate

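    // Delegates must be added to the options before the interpreter is created.
    final interpreterOptions = InterpreterOptions();
    // interpreterOptions.addDelegate(GpuDelegateV2());
    // interpreterOptions.addDelegate(XNNPackDelegate());
    // interpreterOptions.useNnApiForAndroid = true;
    _interpreter = Interpreter.fromBuffer(
      modelBytes,
      options: interpreterOptions,
    );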

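You can choose which backend (delegate) TFLite uses for inference; by default it runs on the CPU.

XNNPack: TensorFlow Lite's optimized CPU backend

It uses CPU vector instructions such as ARM NEON and x86 SIMD, together with multithreading, to speed up CPU inference.

No special hardware required
Stable support for most ops
Efficient for small models and for INT8/FP32 models
Behaves consistently across platforms

GpuDelegateV2: GPU backend

It offloads parallel operations such as Conv and MatMul to the GPU, which can substantially speed up CNN-style models.

Based on OpenCL / OpenGL ES (Android) and Metal (iOS)
FP16 arithmetic improves memory and compute efficiency
Well suited to real-time image and vision models
Ops the GPU does not support fall back to the CPU

NNAPI: Android Neural Networks API backend

It delegates inference to the device's hardware accelerators (NPU, DSP, GPU, etc.) through the Neural Networks API.

Android decides which hardware actually runs the model
Often effective for INT8-quantized models
Performance varies widely by device and vendor
Unsupported ops fall back to CPU reference kernels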

iOS Support

    final rotationDegrees = Platform.isIOS
        ? 0 // iOS preview is already oriented; avoid double rotation on boxes
        : controller.description.sensorOrientation;

On iOS, the image format obtained from CameraImage is different (NV12), so it is handled separately.

The frames also arrive with rotation already applied, so the rotation step is skipped.

https://opencv.org/releases/

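I intended to ship the OpenCV iOS SDK inside the FFI Plugin, but the OpenCV binary is over 500MB. (By default GitHub rejects pushes containing files larger than 100MB.)

Building OpenCV yourself with only the required architectures and modules would probably make it much smaller.

For now the iOS SDK is not included, so please follow the steps below when building the yolo sample for iOS.

1. Clone the ffi_plugin project and place the OpenCV iOS SDK in it

2. In the yolo sample's pubspec.yaml, change the ffi_plugin_look dependency to the local path

e.g.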
  ffi_plugin_look:
#    git:
#      url: https://github.com/cornpip/ffi_plugin_test.git
#      ref: master
    path: ../ffi_plugin_look
