Add background segmentation mask #142
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1688,5 +1688,57 @@ <h2>MediaStream in workers</h2> | |
};</pre> | ||
</div> | ||
</section> | ||
<section> | ||
<h2>Background segmentation mask</h2> | ||
<p>Some platforms or User Agents may provide built-in support for background segmentation of video frames, in particular for camera video streams. | ||
Web applications may want to control whether background segmentation is computed at the source level and to get access to the computed segmentation masks. | ||
This allows the web application, for instance, | ||
to do custom framing or background blurring or replacement | ||
while leveraging the platform-computed background segmentation. | ||
It also allows the web application | ||
to access the original unmodified frame and | ||
to fine-tune frame modifications to its liking. | ||
For that reason, we extend {{MediaStreamTrack}} with the following properties and {{VideoFrame}} with the following attributes. | ||
</p> | ||
<pre class="idl"> | ||
partial dictionary MediaTrackSupportedConstraints { | ||
boolean backgroundSegmentationMask = true; | ||
}; | ||
|
||
partial dictionary MediaTrackConstraintSet { | ||
ConstrainBoolean backgroundSegmentationMask; | ||
}; | ||
|
||
partial dictionary MediaTrackSettings { | ||
boolean backgroundSegmentationMask; | ||
}; | ||
|
||
partial dictionary MediaTrackCapabilities { | ||
sequence<boolean> backgroundSegmentationMask; | ||
};</pre> | ||
<section> | ||
<h3>VideoFrame interface extensions</h3> | ||
<pre class="idl"> | ||
partial interface VideoFrame { | ||
readonly attribute VideoFrame? backgroundSegmentationMask; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I imagine this isn't going to suffer infinite recursion because the second layer deep will be guaranteed nullable. But it still strikes me as a bit odd to expose a full There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, recursion is definitely not wanted. While I by no mean insist on Additionally, because usages of background segmentation masks are manifold (it could be post-processed remotely, locally on CPU or on GPU, etc.) and sources and pre-processing could vary (maybe the source is a boolean matrix or an integer matrix or a GPU texture), it would be good IMHO if the API didn't enforce a particular storage or representation. A There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is the attribute readonly? If JS wishes to modify the background segmentation mask of a frame, how can you do it? Create a new video frame with a new segmentation mask member? How is that passed to the video frame constructor? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note VideoFrame is defined by the Media WG, so I think this needs to be discussed there. Unless we make
These are good questions I suspect the Media WG can answer. They made VideoFrame and its metadata immutable and define its interaction model. Like @eladalon1983 I find it odd to expose a full There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I made |
||
};</pre> | ||
<section> | ||
<h4>Attributes</h4> | ||
<dl data-link-for="VideoFrame" data-dfn-for="VideoFrame" class="attributes"> | ||
<dt><dfn><code>backgroundSegmentationMask</code></dfn> | ||
of type | ||
<span class="idlAttrType">{{VideoFrame}}</span>, | ||
readonly</dt> | ||
<dd> | ||
<p>A background segmentation mask with | ||
white denoting certainly foreground, | ||
black denoting certainly background and | ||
grey denoting uncertainty.</p> | ||
Is it really only "uncertainty" that's represented? Is it perhaps sometimes partial transparency, and sometimes ambiguity? Could anything be said here to clarify that shades of grey tend more towards the foreground/background based on being lighter/darker?
Done. |
||
</dd> | ||
</dl> | ||
</section> | ||
</div> | ||
</section> | ||
</section> | ||
</body> | ||
</html> |
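For readers following the diff, here is a minimal sketch of how an application might consume such a mask once it has the pixels in hand. It is not part of the PR. It assumes the frame, a replacement background, and the mask have all been copied into same-sized RGBA buffers (for example via a canvas), and that the mask is greyscale so its red channel carries the value. The function name `blendWithMask` is hypothetical. Grey values are treated as a blend weight, which is one possible reading of the "white is foreground, black is background" description above.

```typescript
// Sketch only: blend a camera frame over a replacement background using a
// greyscale segmentation mask. All three buffers are assumed to be RGBA
// (4 bytes per pixel) with identical dimensions.
function blendWithMask(
  frame: Uint8ClampedArray,
  background: Uint8ClampedArray,
  mask: Uint8ClampedArray,
): Uint8ClampedArray {
  if (frame.length !== background.length || frame.length !== mask.length) {
    throw new RangeError("frame, background and mask must have the same size");
  }
  const out = new Uint8ClampedArray(frame.length);
  for (let i = 0; i < frame.length; i += 4) {
    // Assumption: the mask's red channel holds the grey level.
    // 255 (white) means foreground, 0 (black) means background.
    const weight = mask[i] / 255;
    for (let c = 0; c < 3; c++) {
      out[i + c] = Math.round(
        frame[i + c] * weight + background[i + c] * (1 - weight),
      );
    }
    out[i + 3] = 255; // the composited pixel is fully opaque
  }
  return out;
}
```

Treating grey as a linear weight is a design choice of this sketch; an application could equally apply its own curve or threshold before blending.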
Would it ever be interesting and feasible to tweak the parameters by which segmentation is done?
At least on Windows, the platform model does not allow tweaking segmentation parameters today. Using tensorflow.js with the `BodyPix` model for blur, I see there's at least a `segmentationThreshold` parameter. Maybe it's the same as `foregroundThresholdProbability` with the `MediaPipeSelfieSegmentation` model?
Did you have some other parameters in mind?
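To make the idea of a segmentation threshold concrete, here is a rough application-side sketch. It is not how BodyPix, MediaPipe, or any platform model applies its threshold internally. It assumes a greyscale RGBA mask buffer and a threshold in the range 0 to 1; the function name `thresholdMask` is hypothetical.

```typescript
// Sketch only: turn a greyscale RGBA mask into a strict black-and-white mask.
// Pixels at or above the threshold become foreground (white), the rest
// become background (black).
function thresholdMask(
  mask: Uint8ClampedArray,
  threshold: number,
): Uint8ClampedArray {
  if (!(threshold >= 0 && threshold <= 1)) {
    throw new RangeError("threshold must be between 0 and 1");
  }
  const cutoff = threshold * 255;
  const out = new Uint8ClampedArray(mask.length);
  for (let i = 0; i < mask.length; i += 4) {
    // Assumption: the red channel carries the grey level of the mask.
    const value = mask[i] >= cutoff ? 255 : 0;
    out[i] = value;
    out[i + 1] = value;
    out[i + 2] = value;
    out[i + 3] = 255;
  }
  return out;
}
```

Since a page can do this itself on the grey mask, a boolean constraint does not prevent applications from choosing their own cut-off.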
I am not knowledgeable enough on what parameters would be best to include. I was mostly wondering if this is something we foresee extending from a boolean to a set of parameters, and if so, whether there was a viable path for such future extensions given the current API shape.
In Media Capture API, the parameter space is flat and not hierarchical. As an example, there is a constrainable property called `whiteBalanceMode` which can be constrained to `manual`. If one then wants to manually change the white balance, there is a constrainable property called `colorTemperature` which can be constrained separately in order to do that.
So if we later would like to add a numeric constrainable property called `backgroundSegmentationThreshold` (which could change the segmentation mask to be pre-processed to a black and white mask according to the threshold without shades of grey) or a string constrainable property called `backgroundSegmentationModel` (to use the particular AI model), we could certainly do that.
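As an illustration of the flat parameter space, the sketch below builds plain constraint objects in which every property sits at the same level. The helper names and the `FlatConstraintSet` type are hypothetical, and `backgroundSegmentationThreshold` is only the possible future property floated above; it is not specified anywhere.

```typescript
// Sketch only: constrainable properties live side by side in one flat object.
type ConstraintValue = boolean | number | string;
type FlatConstraintSet = Record<string, ConstraintValue>;

// Existing pattern: a mode property plus a separately constrained value.
function manualWhiteBalance(kelvin: number): FlatConstraintSet {
  return { whiteBalanceMode: "manual", colorTemperature: kelvin };
}

// Proposed boolean from this PR, optionally with a hypothetical sibling.
function withSegmentation(
  base: FlatConstraintSet,
  threshold?: number,
): FlatConstraintSet {
  const result: FlatConstraintSet = { ...base, backgroundSegmentationMask: true };
  if (threshold !== undefined) {
    // Hypothetical future property; shown only to illustrate extensibility.
    result.backgroundSegmentationThreshold = threshold;
  }
  return result;
}

// In a browser, such a set could be handed to something like
// track.applyConstraints({ advanced: [constraints] }).
```

Adding a new property later would leave existing callers untouched, because nothing is nested under the boolean.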