Machine vision is getting really impressive, and Apple's Core ML offers some pretty accessible machine vision.

Tools like Lobe (https://docs.lobe.ai/docs/export/export/) make it easier to train custom Core ML models.

Request: a Core ML node that uses the Core ML framework and outputs results so they can be used in the composition.

Could be a Pro feature?
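
For a sense of what such a node might wrap internally, here's a minimal Swift sketch (not Vuo code): it loads an arbitrary compiled Core ML model at runtime and runs a single prediction on an image. The model URL and the "image" input feature name are hypothetical; a generic node would presumably expose these as ports and discover the real names via the model's description.

```swift
import Foundation
import CoreGraphics
import CoreML

/// Minimal sketch: load a compiled Core ML model (.mlmodelc) at runtime
/// and run one prediction on a CGImage. The "image" feature name is
/// hypothetical; a generic node would read the real input/output names
/// from model.modelDescription.
func runModel(at url: URL, on image: CGImage) throws -> MLFeatureProvider {
    let model = try MLModel(contentsOf: url)

    // Look up the size/format constraint the model declares for its image input.
    guard let constraint = model.modelDescription
            .inputDescriptionsByName["image"]?.imageConstraint else {
        throw NSError(domain: "CoreMLSketch", code: 1,
                      userInfo: [NSLocalizedDescriptionKey: "No 'image' input found"])
    }

    // Wrap the CGImage as a feature value; Core ML scales it to match the constraint.
    let value = try MLFeatureValue(cgImage: image, constraint: constraint, options: nil)
    let input = try MLDictionaryFeatureProvider(dictionary: ["image": value])
    return try model.prediction(from: input)
}
```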

Comments

Submitted by jmcc

Feature status: Waiting for review by Team Vuo » Waiting for more information from reporter

keithlang,

Just to begin with, Core ML currently has 6 very different model types. Our goal with Vuo is to keep nodes understandable and task-focused, so we think this is too broad for a single Vuo feature request. Could you submit requests for the specific things you'd like to be able to do using ML? We already have several requests for tracking in images, such as Camera tracking and Tracking blobs, that seem similar to the kinds of things you might want to do with Core ML.

Submitted by keithlang

I am specifically interested in hand tracking; i.e., imagine a node similar to the Find Faces in Image node. 'Hand and Body Pose detection' came as part of Big Sur.

One current workaround, running the standalone HandPose OSC app, is OK, but it's limited to one hand and doesn't offer the option of sending single frames for recognition. It also has the downside of not being able to be included in an App export, and I assume it incurs some performance tradeoffs by running as a separate app.

Submitted by jmcc

Vuo has built-in support for hand tracking using the Leap Motion, but not with normal RGB (non-depth) cameras.

HandPose OSC uses TensorFlow's handpose model (not Core ML). That model uses the Apache license, so we could potentially integrate it into Vuo (though, as you noted, it currently only tracks a single hand at a time).

Apple provides VNDetectHumanHandPoseRequest (only available in macOS 11.0+). We briefly tested their sample code; it's slower than HandPose OSC, and it also only detects a single hand at a time.
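
For reference, here's a trimmed-down Swift sketch of what using that API looks like for a single image (assuming macOS 11.0+ and a CGImage input; a hedged illustration, not Vuo code or Apple's sample verbatim):

```swift
import CoreGraphics
import Vision

/// Sketch: run Vision's hand-pose detector on a single image (macOS 11.0+).
func detectHandPose(in image: CGImage) throws -> [VNHumanHandPoseObservation] {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = 1   // mirrors the single-hand behavior noted above
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    return request.results ?? []
}

// Example: the index fingertip of the first detected hand, as a normalized
// point (0–1 range, lower-left origin) with a confidence value.
// let tip = try detectHandPose(in: cgImage).first?
//     .recognizedPoints(.indexFinger)[.indexTip]
```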

So, for single-hand detection, we have two options, each at two-dot complexity (and Pro only).

For multiple-hand detection, we could probably train a model to segment the image into separate per-hand images and run the existing model on each of those sub-images, but that would bump it up to three-dot complexity.
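
To make that approach concrete, here's a hedged Swift sketch: given per-hand bounding boxes from a hypothetical segmentation step (boxes assumed to be in pixel coordinates), crop each region and run the single-hand detector on each crop.

```swift
import CoreGraphics
import Vision

/// Sketch of the approach above: crop each candidate hand region (boxes in
/// pixel coordinates, produced by some hypothetical segmentation model) and
/// run the single-hand detector on each crop.
func detectHands(in image: CGImage,
                 candidateBoxes: [CGRect]) throws -> [VNHumanHandPoseObservation] {
    var observations: [VNHumanHandPoseObservation] = []
    for box in candidateBoxes {
        guard let crop = image.cropping(to: box) else { continue }
        let request = VNDetectHumanHandPoseRequest()
        let handler = VNImageRequestHandler(cgImage: crop, options: [:])
        try handler.perform([request])
        observations.append(contentsOf: request.results ?? [])
    }
    return observations
}
```

One wrinkle: each observation's landmark points are normalized to its crop, so they'd need to be remapped into full-image coordinates before use.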

Let us know whether you want single- or multiple-hand detection, and we'll modify the FR accordingly and open it for voting.

Feature status

  • Submitted to vuo.org
  • Waiting for more information from reporter