[camera] Support image streams on Windows platform #97542

Open
jokerttu opened this issue Jan 31, 2022 · 27 comments · Fixed by flutter/packages#7067 · May be fixed by flutter/packages#8234
Labels
a: desktop Running on desktop c: new feature Nothing broken; request for a new capability has partial patch There is a PR awaiting someone to take it across the finish line p: camera The camera plugin P3 Issues that are less important to the Flutter project package flutter/packages repository. See also p: labels. platform-windows Building on or for Windows specifically team-windows Owned by the Windows platform team triaged-windows Triaged by the Windows platform team

Comments

@jokerttu

Use case

#4641 adds camera support on the Windows platform but does not implement the following CameraController feature:

  • ImageStream

Proposal

Camera Windows plugin should have support for these camera plugin methods:

  • startImageStream
  • stopImageStream

The plugin should send images over the MethodChannel. The Windows camera plugin already handles images in raw format, so sending them over the channel should not be a big task to implement.

@stuartmorgan stuartmorgan added a: desktop Running on desktop p: camera The camera plugin p: first party platform-windows Building on or for Windows specifically labels Jan 31, 2022
@gspencergoog gspencergoog added the P3 Issues that are less important to the Flutter project label Feb 3, 2022
@robinduerhager

Hey, just wanted to ask if there is a date or something for when this will be available? I think capturing image data and processing it might be one of the main use cases of the camera module :). At least we really need that 😄.

@quietxu

quietxu commented May 12, 2022

Is there no EventChannel on Windows desktop?

@robinduerhager

@quietxu There is actually an EventChannel class in flutter/event_channel.h.

@quietxu

quietxu commented May 28, 2022

@robinduerhager Thanks
I also need camera stream data in my project. I don't send it back to Flutter, but do the image processing on Windows, in C++.

@postacik

postacik commented Jun 3, 2022

It would be great to have this feature :)

@zof1985

zof1985 commented Jun 26, 2022

This feature would be really helpful.

@robinduerhager

Hey @jokerttu, since my last comment is 6 months old and I really need this feature, I wanted to check if you could give us a little heads-up about when you think this will be implemented :)

I would like to help out with this, but I'm not experienced in C++. Maybe getting pointed in the right direction could lead to something? 🤔

@Policy56

That's a needed feature.
An image stream on the Windows platform would help with text recognition or barcode scanning in Windows apps.

Following this issue !

@sportmachine

I can help with C++ or Dart. We need to pass a flag on to the OS.

@flutter-triage-bot flutter-triage-bot bot added the package flutter/packages repository. See also p: labels. label Jul 5, 2023
@Hixie Hixie removed the plugin label Jul 6, 2023
@jonatandorozco

Any update on this?

@stuartmorgan stuartmorgan added the c: new feature Nothing broken; request for a new capability label Sep 11, 2023
@jlundlumination

I set up a basic working prototype of this. My C++ is very rusty and the documentation for Windows method channels is practically nonexistent, but I was able to get it working by sending frames through a method channel that I pass into the texture handler.

Here's some code, if it helps anyone. It needs to be converted to use an EventChannel for the image stream, and I haven't implemented stopImageStream either. If I get some more time I'll try to come back and implement them.

https://github.com/LuminationDev/camera_windows

If you use this with the camera library, you will need to comment out the assertion on line 461 of camera_controller.dart.

@bawahakim

bawahakim commented Feb 7, 2024

@jlundlumination Tried it out and kinda works out of the box! However, if I try to save the image using the image package with this

final image = img.Image.fromBytes(
          width: width,
          height: height,
          bytes: cameraImage.planes[0].bytes.buffer,
        );
final List<int> png = img.encodePng(image);

I get a weird image out.

[Image: frame_1707287891527]

I have little experience in image manipulation, but I assume that it's something to do with the encoding/format. Had a back and forth with ChatGPT, and ended up doing this, which at least gets rid of the grid-like picture, but still messes up the colors. Apparently the bgra8888 format should not have more than 1 plane; in this case there are 4 planes, which apparently conforms more to the YUV420 format.

        final width = cameraImage.width;
        final height = cameraImage.height;

        final yPlane = cameraImage.planes[0].bytes;
        final uPlane = cameraImage.planes[1].bytes;
        final vPlane = cameraImage.planes[2].bytes;

        final rgbImage = Uint8List(width * height * 3); // For RGB, * 4 for RGBA

        for (var y = 0; y < height; y++) {
          for (var x = 0; x < width; x++) {
            final yIndex = x + y * width;
            final uvIndex = (x ~/ 2) +
                (y ~/ 2) * (width ~/ 2); // Adjust based on subsampling

            // Simple YUV to RGB conversion formula
            final Y = yPlane[yIndex];
            final U = uPlane[uvIndex] - 128;
            final V = vPlane[uvIndex] - 128;

            final R = (Y + (1.402 * V)).round().clamp(0, 255);
            final G =
                (Y - (0.344136 * U) - (0.714136 * V)).round().clamp(0, 255);
            final B = (Y + (1.772 * U)).round().clamp(0, 255);

            final rgbIndex = yIndex * 3;
            rgbImage[rgbIndex] = R;
            rgbImage[rgbIndex + 1] = G;
            rgbImage[rgbIndex + 2] = B;
          }
        }

[Image: frame_1707288125052]

Not sure where to go from there. I've also tried setting the imageFormatGroup in the CameraController to nv21 and yuv420, but same result, and the resulting format is always bgra8888.

Appreciate the efforts you've put to get us there!

@jlundlumination

jlundlumination commented Feb 7, 2024

@bawahakim The image is structured into 4 planes: r, g, b, and a (always 255, not sure why I send it). For our use case we needed the separate pixel arrays, so it would have been redundant to separate them later. If you want to convert it to an image you can use:

img.Image WinConvertRGBToImage(CameraImage cameraImage, Rect rect) {
  final width = cameraImage.width;
  final height = cameraImage.height;
  final image = img.Image(
      width: rect.right.round() - rect.left.round(),
      height: rect.bottom.round() - rect.top.round());
  Plane rplane = cameraImage.planes[0];
  Plane gplane = cameraImage.planes[1];
  Plane bplane = cameraImage.planes[2];
  for (var h = rect.top.round(); h < rect.bottom.round(); h++) {
    for (var w = rect.left.round(); w < rect.right.round(); w++) {
      // Row offset is h * width; using h - 1 would read the previous row and
      // produce a negative index when rect.top is 0.
      int index = (h * width) + w;
      image.data?.setPixel(
          w - rect.left.round(),
          h - rect.top.round(),
          img.ColorRgb8(
              rplane.bytes[index], gplane.bytes[index], bplane.bytes[index]));
    }
  }
  return image;
}

Alternatively, you could have a look at https://github.com/LuminationDev/camera_windows/blob/b1095c53b65456dc5b1cbd343c76a4f933ce2323/windows/texture_handler.cpp#L101C1-L102C1. You should be able to interleave the pixel bytes as BGRA there on the Windows side, put it all in the r array,
and then just adjust the bytes per row on the Flutter side here: https://github.com/LuminationDev/camera_windows/blob/b1095c53b65456dc5b1cbd343c76a4f933ce2323/lib/camera_windows.dart#L315
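
The interleaving step suggested here could look roughly like the sketch below. It is not taken from the linked repository: `InterleaveToBgra` is a hypothetical helper, and it assumes the four planes are equally sized byte arrays with one byte per pixel.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of camera_windows): packs four equally sized
// single-channel planes into one buffer in B, G, R, A byte order.
std::vector<uint8_t> InterleaveToBgra(const std::vector<uint8_t>& r,
                                      const std::vector<uint8_t>& g,
                                      const std::vector<uint8_t>& b,
                                      const std::vector<uint8_t>& a) {
  if (r.size() != g.size() || r.size() != b.size() || r.size() != a.size()) {
    throw std::invalid_argument("all planes must have the same size");
  }
  std::vector<uint8_t> bgra(r.size() * 4);
  for (std::size_t i = 0; i < r.size(); ++i) {
    bgra[i * 4 + 0] = b[i];
    bgra[i * 4 + 1] = g[i];
    bgra[i * 4 + 2] = r[i];
    bgra[i * 4 + 3] = a[i];
  }
  return bgra;
}
```

With a packed buffer like this, the bytes per row reported to the Flutter side would be width * 4 instead of width, assuming no row padding.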

@bawahakim

@jlundlumination Amazing, that works perfectly! Managed to do it a bit simpler with an image rather than a rect. Unsure of performance but in our case it's primarily for dev purposes.

Uint8List _processRgbImage(CameraImage cameraImage) {
    final width = cameraImage.width;
    final height = cameraImage.height;

    final rplane = cameraImage.planes[0].bytes;
    final gplane = cameraImage.planes[1].bytes;
    final bplane = cameraImage.planes[2].bytes;

    final image = img.Image(width: width, height: height);

    for (var y = 0; y < height; y++) {
      for (var x = 0; x < width; x++) {
        final index = (y * width) + x;
        image.setPixelRgba(
          x,
          y,
          rplane[index],
          gplane[index],
          bplane[index],
          255,
        );
      }
    }

    return image.getBytes();
  }

If you ever get around to implementing stopping the stream, please let me know. Thanks again!

@jlundlumination

Forgot I had left the rect in there. It allows cropping without needing to process any of the additional data outside the crop.

@lhzmrl

lhzmrl commented Mar 2, 2024

This is a useful feature, which will help developers develop more creative programs, and of course, achieve the goal of Flutter “Build for any screen”.

@thiagotognoli

That's a needed feature.

@carmanhani

Having this feature would be fantastic! :)

@cbracken cbracken added team-windows Owned by the Windows platform team and removed team-desktop labels Jun 6, 2024
@flutter-triage-bot

The triaged-desktop label is irrelevant if there is no team-desktop label or fyi-desktop label.

@jlundOverlay

jlundOverlay commented Jul 8, 2024

@bawahakim I've had another shot at implementing this. It's a bit more refined this time and should, in theory, stop the stream. https://github.com/overlay-ai-pty-ltd/packages/tree/main

However, the camera data is now all contained in a single BGRA plane, so the formatting will be slightly different.
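
For anyone adapting the per-plane helpers from earlier in the thread, the index arithmetic for a packed BGRA plane is sketched below. This is an illustration only: `BgraToRgb` is a hypothetical helper, and it assumes 4 bytes per pixel in B, G, R, A order with a row stride of `bytes_per_row` that may include padding. The same offsets apply if the loop is written in Dart.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the camera plugin: copies a packed BGRA
// frame into a tightly packed RGB buffer, dropping the alpha byte.
std::vector<uint8_t> BgraToRgb(const std::vector<uint8_t>& bgra,
                               std::size_t width, std::size_t height,
                               std::size_t bytes_per_row) {
  // Assumption: each row holds at least width * 4 bytes; extra bytes are padding.
  const std::size_t needed =
      height == 0 ? 0 : (height - 1) * bytes_per_row + width * 4;
  if (bytes_per_row < width * 4 || bgra.size() < needed) {
    throw std::invalid_argument("buffer too small for the given dimensions");
  }
  std::vector<uint8_t> rgb(width * height * 3);
  for (std::size_t y = 0; y < height; ++y) {
    for (std::size_t x = 0; x < width; ++x) {
      const std::size_t src = y * bytes_per_row + x * 4;
      const std::size_t dst = (y * width + x) * 3;
      rgb[dst + 0] = bgra[src + 2];  // R
      rgb[dst + 1] = bgra[src + 1];  // G
      rgb[dst + 2] = bgra[src + 0];  // B
    }
  }
  return rgb;
}
```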

@robinduerhager

Hey @jlundOverlay, thank you for your work and the PR :).

I do have a question / proposal regarding your implementation: I was wondering why you rely on the preview handler for fetching the image data instead of the record handler. In capture_controller.cpp, on lines 476 to 491, the preview resolution is limited by the PlatformResolutionPreset, while the record handler will always try to get the max resolution from the Media Foundation platform:

HRESULT CaptureControllerImpl::FindBaseMediaTypes() {
  // [...]
  if (!FindBestMediaType(
          (DWORD)MF_CAPTURE_ENGINE_PREFERRED_SOURCE_STREAM_FOR_VIDEO_PREVIEW,
          source.Get(), base_preview_media_type_.GetAddressOf(),
          GetMaxPreviewHeight(), &preview_frame_width_,
          &preview_frame_height_)) {
    return E_FAIL;
  }

  // Find base media type for record and photo capture.
  if (!FindBestMediaType(
          (DWORD)MF_CAPTURE_ENGINE_PREFERRED_SOURCE_STREAM_FOR_VIDEO_RECORD,
          source.Get(), base_capture_media_type_.GetAddressOf(), 0xffffffff,
          nullptr, nullptr)) {
    return E_FAIL;
  }
  // [...]
}

// Used only by the Preview Handler Media Type
// Record Handler Media Type will always be 0xffffffff for the max_height parameter of the FindBestMediaType function
uint32_t CaptureControllerImpl::GetMaxPreviewHeight() const {
  switch (media_settings_.resolution_preset()) {
    case PlatformResolutionPreset::low:
      return 240;
    case PlatformResolutionPreset::medium:
      return 480;
    case PlatformResolutionPreset::high:
      return 720;
    case PlatformResolutionPreset::veryHigh:
      return 1080;
    case PlatformResolutionPreset::ultraHigh:
      return 2160;
    case PlatformResolutionPreset::max:
    default:
      // no limit.
      return 0xffffffff;
  }
}

// Signature of FindBestMediaType
bool FindBestMediaType(DWORD source_stream_index, IMFCaptureSource* source,
                       IMFMediaType** target_media_type, uint32_t max_height,
                       uint32_t* target_frame_width,
                       uint32_t* target_frame_height,
                       float minimum_accepted_framerate = 15.f)
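
To illustrate how the max_height argument affects the two calls above, here is a simplified sketch. It is not the plugin's FindBestMediaType: `Candidate` and `PickBest` are hypothetical names, and the selection rule (tallest mode that does not exceed max_height and meets the minimum frame rate) is an assumption based only on the signature and comments shown here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a media type reported by the capture source.
struct Candidate {
  uint32_t width;
  uint32_t height;
  float framerate;
};

// Assumed rule: pick the tallest candidate with height <= max_height and
// framerate >= minimum_accepted_framerate. Returns false if none qualifies.
bool PickBest(const std::vector<Candidate>& candidates, uint32_t max_height,
              Candidate* best, float minimum_accepted_framerate = 15.f) {
  bool found = false;
  for (const Candidate& c : candidates) {
    if (c.height > max_height || c.framerate < minimum_accepted_framerate) {
      continue;
    }
    if (!found || c.height > best->height) {
      *best = c;
      found = true;
    }
  }
  return found;
}
```

Under this assumption, a preset such as PlatformResolutionPreset::high (max height 720) caps the preview at 720 rows, while a max_height of 0xffffffff lets the record stream take the tallest mode the camera offers at an acceptable frame rate.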

Also, I was wondering if we would run into issues later if we wanted to implement other image format groups. In preview_handler.cpp, on line 32, the video format is set to RGB32, and I'm wondering whether something like NV12 or YUV420 would work in this context as well, or whether we would have to separate the preview recording from the stream recording, so to say.

What do you think? :)

(Little disclaimer: I had to learn C++ and fix this for our application as well, but sadly never found the time to submit a PR. I changed the record handler to save to a path or stream the content to Flutter. That said, I am by no means an experienced C++ programmer :D).

@jlundOverlay

jlundOverlay commented Jul 16, 2024

Little disclaimer: I had to learn C++ and fix this for our application

Sounds like we are in the same boat😅

I went to have a look at camera_android to see what they do. The implementation is a bit different, but it looks as though it's correct for the image stream to take on the preview resolution.

    // For image streaming, use the provided image format or fall back to YUV420.
    Integer imageFormat = supportedImageFormats.get(imageFormatGroup);
    if (imageFormat == null) {
      Log.w(TAG, "The selected imageFormatGroup is not supported by Android. Defaulting to yuv420");
      imageFormat = ImageFormat.YUV_420_888;
    }
    imageStreamReader =
        new ImageStreamReader(
            resolutionFeature.getPreviewSize().getWidth(),
            resolutionFeature.getPreviewSize().getHeight(),
            imageFormat,
            1);

I'm not sure about the image format. Potentially the image streaming needs to be separated into a stream_handler class (replicating the record_handler) so it can set its own format. @cbracken or @jokerttu, any input on this?

@robinduerhager

robinduerhager commented Jul 17, 2024

Sounds like we are in the same boat😅

Haha, it's a relief to know that 😂!

I went to have a look at camera_android to see what they do. The implementation is a bit different, but it looks as though it's correct for the image stream to take on the preview resolution.

I did a bit more research in this topic and must say you're right with using the preview handler. According to the MF_CAPTURE_ENGINE_SINK_TYPE Enumeration, the preview sink is for live video / immediate display. I guess the record sink is less performant for streaming image data since it also has to encode the frames in e.g. H264, while the preview sink directly streams "raw" RGB / YUV pixel data without the extra encoding layer. You can read something similar from the Remarks Section of the IMFCaptureSink Site. In the IMFCaptureSink::AddStream Parameters Section it is a little more clear about the preview stream giving uncompressed video (and audio).

I'm not sure about the image format. Potentially the image streaming needs to be separated into a stream_handler class (replicating the record_handler) so it can set its own format. @cbracken or @jokerttu, any input on this?

Really interested in this answer as well 😅. Maybe just adding another output stream (if that's possible) to the preview sink or just transforming the RGB values to a desired format for the flutter stream would be the fastest and KISS way to get another different output but using an extra stream_handler sounds nicer in terms of maintainability and separation of concerns.
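
As a rough idea of what "transforming the RGB values to a desired format" could involve, the sketch below converts a packed BGRA frame to NV12 (a full-resolution Y plane followed by interleaved U/V samples at half resolution). It is not from either PR: `BgraToNv12` is a hypothetical helper, it assumes tightly packed rows and even dimensions, and it uses full-range BT.601 (JFIF) coefficients, which may not be what a given consumer expects.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

static uint8_t ClampToByte(double v) {
  return static_cast<uint8_t>(std::min(255.0, std::max(0.0, std::round(v))));
}

// Hypothetical helper: packed BGRA (no row padding) to NV12.
std::vector<uint8_t> BgraToNv12(const std::vector<uint8_t>& bgra,
                                std::size_t width, std::size_t height) {
  if (width % 2 != 0 || height % 2 != 0 || bgra.size() < width * height * 4) {
    throw std::invalid_argument("need even dimensions and a full BGRA frame");
  }
  std::vector<uint8_t> nv12(width * height * 3 / 2);
  const std::size_t uv_offset = width * height;  // UV plane follows the Y plane.
  for (std::size_t y = 0; y < height; y += 2) {
    for (std::size_t x = 0; x < width; x += 2) {
      double sum_r = 0, sum_g = 0, sum_b = 0;
      // Luma is written per pixel; chroma is averaged over each 2x2 block.
      for (std::size_t dy = 0; dy < 2; ++dy) {
        for (std::size_t dx = 0; dx < 2; ++dx) {
          const std::size_t px = ((y + dy) * width + (x + dx)) * 4;
          const double b = bgra[px], g = bgra[px + 1], r = bgra[px + 2];
          nv12[(y + dy) * width + (x + dx)] =
              ClampToByte(0.299 * r + 0.587 * g + 0.114 * b);
          sum_r += r;
          sum_g += g;
          sum_b += b;
        }
      }
      const double r = sum_r / 4, g = sum_g / 4, b = sum_b / 4;
      const std::size_t uv = uv_offset + (y / 2) * width + x;
      nv12[uv] = ClampToByte(-0.168736 * r - 0.331264 * g + 0.5 * b + 128.0);
      nv12[uv + 1] = ClampToByte(0.5 * r - 0.418688 * g - 0.081312 * b + 128.0);
    }
  }
  return nv12;
}
```

A per-frame CPU conversion like this is the simple option; whether it is fast enough for live streaming at preview resolution would need measuring.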

This thread has been automatically locked since there has not been any recent activity after it was closed. If you are still experiencing a similar issue, please open a new bug, including the output of flutter doctor -v and a minimal reproduction of the issue.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 10, 2024
@stuartmorgan stuartmorgan reopened this Oct 29, 2024
@flutter flutter unlocked this conversation Oct 29, 2024
@stuartmorgan
Contributor

Re-opening, as this never fully landed (the first PR of two was accidentally set to auto-close this issue). The implementation was never completed, and what did land is being reverted due to issues discovered after it landed. If anyone is interested in picking up this work, flutter/packages#7067 and flutter/packages#7220 would be a starting point, but the issues raised in both PRs (notably, here, here, and here) would need to be addressed.

@stuartmorgan stuartmorgan added the has partial patch There is a PR awaiting someone to take it across the finish line label Oct 30, 2024
@liff

liff commented Nov 27, 2024

Is anyone planning to continue or already working on this? I’m looking into continuing from the implementation in flutter/packages#7067 and fixing the issues that were raised in flutter/packages#7951.

cc: @jlundOverlay

@jlundOverlay

That would be great. The company time I had to complete this ran out. I always meant to come back to it personally but never found the time 😅. Let me know if there's any way I can help.

@liff liff linked a pull request Dec 5, 2024 that will close this issue
11 tasks