Android Question Camera image data access overview

dara hayes

Member
Licensed User
Longtime User
I am interested in accessing the image data from the camera. Most of the applications shown on the forum save files as JPG or send images over the network to other platforms. I want to do a very simple machine vision application: my target is a highly reflective tape which tends to saturate the camera once it is well lit, and I only want to check a quite small area of the image, not scan the entire image. Can I do this using the array of bytes provided in the Preview event of the camera library? If so, how is the image organised and what format is the data in? Most importantly, how many frames per second can I expect to be able to inspect, given that the routine scanning for my target does not take a lot of time? Is this approach doable using B4A, or am I expecting more than can be achieved with this language on the Android OS?
 

dara hayes

Member
Licensed User
Longtime User
Erel said:
The performance of B4A apps is similar to the performance of Java apps.

How many frames per second do you want to analyze?

Thanks Erel,
I now see the posts on accessing the preview image. "How do I read YUV Image from Camera Preview?" (Nov 2013) explains a lot of what I want to do. I am just not sure how long StartPreview takes before an image is ready for examination in the buffer.
I was hoping to scan four or five images per second, but only examining small subsets of the video memory; the precise area of interest is quite small in relation to the overall size of the image. Would five images per second be possible: start the preview, grab and examine the data, then restart the preview to grab another image, five times a second? Are these sorts of frame grab rates possible in B4A on the Android OS?
I am not familiar with the YUV format. Does one have to work with YUV if you don't want to save the grabbed image to a file, or can one convert the image data on the fly in the memory buffer to another more manageable format without physically saving it to a file on the machine?
 
Upvote 0

dara hayes

Member
Licensed User
Longtime User
Erel said:
You can convert the YUV image to JPEG with CameraEx.PreviewImageToJpeg and then load it as a Bitmap. Once you hold a Bitmap you can access its pixels.

I expect the performance to be good enough. See how the frames are handled in the CCTV example.
Many thanks for that, Erel. Looking forward to having a play with it now.
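
For reference, here is a rough B4A sketch of the JPEG route Erel describes. It is untested: it assumes camEx is an initialized instance of the CameraEx class from the camera tutorial, Data() is the byte array from the Preview event, and the quality value of 80 is an arbitrary illustrative choice.

B4X:
Sub Camera1_Preview (Data() As Byte)
    'Convert the raw YUV preview frame to a JPEG byte array...
    Dim jpeg() As Byte = camEx.PreviewImageToJpeg(Data, 80)
    '...then load it as a regular Bitmap.
    Dim ins As InputStream
    ins.InitializeFromBytesArray(jpeg, 0, jpeg.Length)
    Dim bmp As Bitmap
    bmp.Initialize2(ins)
    'bmp's pixels can now be inspected, e.g. with the BitmapCreator type
    '(CopyPixelsFromBitmap / GetARGB).
End Sub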
 
Upvote 0

JordiCP

Expert
Licensed User
Longtime User
As you said you wanted to detect saturation in a certain area, perhaps this helps.

You need to check the pixels' luminance (brightness). The YUV format may already be suitable for your needs, since the first W×H bytes (W and H being the preview resolution) are precisely the pixels' luminance values (the rest carry the colour information).

Look here for a similar case.
 
Upvote 0

dara hayes

Member
Licensed User
Longtime User
Hola JordiCP, I have been reading your posts on this, and having read Erel's code in the CCTV example, that was going to be my very next question:
what is wrong with just using what's in the buffer in YUV format after one has grabbed an image? I would be most interested in learning how to handle this native YUV format, as all I want is a bright spot on the image. I have placed a piece of reflective tape on an object and simply hope to detect when it is in view or not. It's the silver tape one would normally have as stripes on high-visibility overalls; it's actually called "SOLAS" tape. It's grey, and even with a small amount of light on it, it produces almost saturation conditions for most of the CCD sensors on the market. I expect the difference in luminance or brightness between this tape section of the image and the rest of my image to be very large. Thanks for your input; I really want to use the raw YUV image and make my decisions on that.
 
Upvote 0

JordiCP

Expert
Licensed User
Longtime User
Hi dara,

In fact there is nothing wrong with that; both approaches are good, since they give you "nearly" the same information. If you just need to do your own processing based on luminance, dealing with YUV directly is more straightforward.
On the other hand, if you needed to send or save the image, or use 3rd-party libraries that do bitmap processing, you would need the other approach.

Some things I would consider when processing YUV directly (see the sketch after this list):

  • The camera preview rate may vary from device to device even if you try to configure it. It may even change on a specific camera depending on the image itself, since some of the settings default to "auto" (for instance, white balance). If you need 5-6 fps, I would record the last time a preview event was processed (not fired), and only process the next frame if it arrives, say, 160 ms after the previously processed one.
  • The byte array is 1.5*W*H bytes. The first W*H bytes give you the lightness, so if that is all you are interested in, there is no need for any conversion (YUV -> JPEG -> save -> load BMP -> get RGB pixels -> get Y o_O).
  • Then comes the processing:
    • These values are by nature unsigned, but the byte type in B4A and Java is signed, so a conversion is needed.
    • The array is given in the camera's own default orientation, which stays the same even if the device is rotated, so this must be taken into account when reading the array values.
    • Even for a still image, individual pixel values have "noise" and change from frame to frame, so it is a good approach to make decisions based on the mean value of a pixel's neighbourhood rather than on single values, and to compare it against the mean of the rest of the zones.
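
As a worked example of the points above, here is a minimal, untested B4A sketch: it throttles processing to roughly 6 fps, reads only the Y plane, and averages a small region of interest. The ROI coordinates, the 160 ms interval and the 200 threshold are hypothetical values, and PreviewWidth must be set from whatever preview size you actually configure.

B4X:
Sub Globals
    Private LastProcessed As Long   'time (ms) when the last frame was processed
    Private PreviewWidth As Int     'set this from the chosen preview size
    'Hypothetical region of interest, in preview-frame coordinates:
    Private Const RoiX As Int = 100
    Private Const RoiY As Int = 80
    Private Const RoiW As Int = 16
    Private Const RoiH As Int = 16
End Sub

Sub Camera1_Preview (Data() As Byte)
    'Point 1: skip frames that arrive less than 160 ms after the last processed one.
    If DateTime.Now - LastProcessed < 160 Then Return
    LastProcessed = DateTime.Now
    'Point 2: the luminance of pixel (col, row) is Data(row * width + col);
    'no format conversion is needed.
    Dim row, col As Int
    Dim sum As Long = 0
    For row = RoiY To RoiY + RoiH - 1
        For col = RoiX To RoiX + RoiW - 1
            'Bytes are signed in B4A/Java; Bit.And(b, 0xFF) recovers 0..255.
            sum = sum + Bit.And(Data(row * PreviewWidth + col), 0xFF)
        Next
    Next
    'Decide on the neighbourhood mean rather than a single pixel value.
    Dim mean As Int = sum / (RoiW * RoiH)
    If mean > 200 Then Log("Tape in view, mean luminance = " & mean)
End Sub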

Hope this helps!
 
Upvote 0

dara hayes

Member
Licensed User
Longtime User
That is all very helpful and relevant information, many thanks. However, there are a few things I don't understand. Firstly, I know the YUV format is effectively three values per pixel: a single luminance value and then two chroma values, U and V, per pixel.
But when you say the first W*H bytes are Y (luma), I am not sure how the preview byte array is laid out. Is it an array where the first of three bytes is luma, then a U byte, then a V byte, with the fourth byte being the next luma value? Or is it a complete frame (preview size) of luma bytes?
Secondly, when the preview image is stored in YUV, is that an image at the camera's full resolution, or is it scaled down from the camera's current frame size settings? Is there a set rule for the scaling of a preview-size image from the current camera frame size?
 
Upvote 0

JordiCP

Expert
Licensed User
Longtime User
It is not exactly 3 bytes per pixel. To be exact, each pixel has a Y, U and V component, but the U and V information is "shared" between groups of 4 pixels.

If your preview image is, for instance, W=12, H=8, the array will contain (in this order):

Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y
V U V U V U V U V U V U
V U V U V U V U V U V U
V U V U V U V U V U V U
V U V U V U V U V U V U
(1.5 * 12 * 8 = 144 bytes)

The V,U components are shared between more than one Y. For instance, the first V,U pair is the chroma information for array[0], array[1], array[12] and array[13]. But in your case you don't need them, so you can take just the first W*H bytes.


About the resolution: each device has a set of allowed preview resolutions and a set of allowed picture resolutions. If you want to process frames in real time, you only need to set the preview size. I would use the smallest one that is sufficient for your needs, or just subsample the array given in the preview event at N points along each row and column.
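
To make the indexing concrete, here is a small hedged B4A helper that computes where the Y, V and U bytes of a given pixel sit in the preview array, assuming the NV21 layout shown above (the default Android preview format). The sub name is ours, not part of any library.

B4X:
Sub LogYuvIndices (x As Int, y As Int, w As Int, h As Int)
    'Luminance plane: one byte per pixel, row by row.
    Dim yIndex As Int = y * w + x
    'Chroma plane starts at w*h; each interleaved V,U pair serves a 2x2 pixel
    'block, so halve the row and round the column down to an even number.
    Dim vIndex As Int = w * h + Floor(y / 2) * w + (x - Bit.And(x, 1))
    Dim uIndex As Int = vIndex + 1
    Log("Pixel (" & x & "," & y & "): Y@" & yIndex & " V@" & vIndex & " U@" & uIndex)
End Sub

For the 12x8 example above, LogYuvIndices(0, 0, 12, 8) reports Y@0, V@96, U@97: the first V,U pair sits right after the 96 luminance bytes, matching the diagram.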
 
Upvote 0

dara hayes

Member
Licensed User
Longtime User
Ok, thank you for explaining that so well; it's perfectly clear now. I must interrogate my device for its picture and preview sizes and choose a best-fit preview size according to my requirements. Thanks for that.
 
Upvote 0