Kewl's picture

I tested the same project with Vuo 1.2.3 and 1.2.4 and with OS 10.11.4, 10.12.1, and 10.12.2 beta: my observation is that there is a serious performance drop when running Vuo in OS 10.12.

Since I had not done any Vuo work since September, I thought that maybe I left my project in that underperforming state. But no, it was performing OK two months ago. So I looked into Vuo versions since it has just been updated. But no, same bad performance in Vuo 1.2.3 and 1.2.4.

I then turned to the OS and tested on two 2013 Mac Pros under OS 10.11.4, 10.12.1, and 10.12.2. Under 10.11.4, I easily get the 25 fps I programmed (set by the Fire Periodically node), but when I run the composition under 10.12.1 or 10.12.2, it drops to around 6 fps.

Has anybody else encountered this?

Comments

jstrecker's picture
@Kewl, I appreciate your taking the time to investigate the problem.

Some of our team have been running Vuo frequently on macOS 10.12 and haven't noticed any blatant performance drops like the one you're experiencing. It could possibly be related to the combination of 10.12 and your GPU. What GPU do you have?

The slowdown with Build List and Process List may be due to whatever is happening within their feedback loop, rather than those nodes themselves. Can you identify any particular part of the composition that is slow (i.e., removing it from the composition significantly improves performance)?

Kewl's picture
I have seen the slowdown on these Mac models:

http://www.everymac.com/systems/apple/macbook_pro/specs/macbook-pro-core...

http://www.everymac.com/systems/apple/mac_pro/specs/mac-pro-quad-core-3....

http://www.everymac.com/systems/apple/mac_pro/specs/mac-pro-eight-core-3...

As for the GPU, on the MacBook Pro, it's the NVIDIA GeForce GT 750M, and on the Mac Pros, it's the AMD FirePro D300. The screenshots I posted are from the 2013 Mac Pro 8-core 3.0 GHz.

Kewl's picture
In my problematic composition, there are two sections where the performance is cut in half when I use OS 10.12.

The 1st section is circumscribed by a Build List: going into that section, the performance is at 25 events/sec, going out is between 12 and 16 events/sec. Following the 1st section, the 2nd section is circumscribed by a Process List: going into that section, the performance is between 12 and 16 events/sec, going out is between 6 and 8 events/sec.

What is similar in these two sections:

  • 1st section has two Calculate nodes, one Average and one Make 4D point;
  • 2nd section has six Calculate nodes and one Make 2D point.

jstrecker's picture
We were able to reproduce the performance drop in this composition in 10.12 as compared to 10.11. (Thanks for emailing it, @Kewl.)

After experimenting with different variations on the composition, and taking time profiles to see how much time was spent in each part of the code, we found that the problem is not specific to any node or node type. It has to do with low-level code that is used to synchronize different parts of the composition as they run in parallel. A data structure called a semaphore, which is part of Apple's Grand Central Dispatch library, is now taking about 3x longer to do its job in 10.12. We'll need to improve our code to avoid relying on semaphores so much.

jstrecker's picture
Ha. In further testing, we found that the dispatch semaphore was 12x to 27x slower on 10.12 than on 10.11 when under heavy usage by multiple threads. We have reported the bug to Apple (rdar://29473993).

Not knowing when or if they'll fix it, we went ahead and switched to an alternative synchronization primitive called an atomic spinlock in the parts of the code that were most affected by the slowdown. This fixes the slowdown on 10.12 and also improves performance in 10.11. The fix will be included in Vuo 1.2.5.
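
For readers unfamiliar with the term, here is a minimal sketch of an atomic spinlock in portable C++. It is illustrative only and is not Vuo's actual code: the `SpinLock` class and the `count_with_spinlock` helper are hypothetical names, and `std::atomic_flag` is assumed here as the underlying atomic operation. Instead of asking the OS to put a waiting thread to sleep, as a semaphore can, each thread simply retries an atomic test-and-set until it succeeds.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock built on std::atomic_flag (an assumption for
// illustration; the thread does not show Vuo's real implementation).
class SpinLock {
public:
    void lock() {
        // test_and_set returns the previous value, so we loop until we are
        // the thread that flips the flag from clear to set.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // give other threads a chance to run
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Several threads increment one shared counter; every increment is guarded
// by the spinlock, so no updates are lost.
long count_with_spinlock(int thread_count, int increments_per_thread) {
    assert(thread_count > 0 && increments_per_thread >= 0);
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto &th : threads) th.join();
    return counter;
}
```

Calling `count_with_spinlock(4, 10000)` should return 40000, since the lock serializes every increment. A spinlock like this tends to suit very short critical sections; for long waits, busy-looping wastes CPU time that a blocking primitive would give back.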

Kewl's picture
So I'm guessing you have eliminated any dependency on Apple's Grand Central Dispatch library?

I created a 10.11 partition on my MacBook Pro just to run Vuo, but if the performance is the same with 10.12, I'll ditch the 10.11 partition...

jstrecker's picture
No, we haven't eliminated the dependency on Grand Central Dispatch. The other parts of it besides semaphores (dispatch queues, dispatch groups, etc.) are still working fine and are quite useful. Semaphores are sometimes OK, too; an occasionally-used one doesn't make any noticeable difference. The noticeable slowdown is when you have a frequently-used semaphore that is under heavy usage by multiple threads. That is what we fixed in Vuo 1.2.5.
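
One rough way to see the difference between an occasionally-used lock and one under heavy usage by multiple threads is to time the same number of lock/unlock operations with one thread and then with several. The sketch below is an assumption-laden illustration, not the test Vuo's team ran: it uses `std::mutex` as a portable stand-in for a dispatch semaphore (Grand Central Dispatch itself is not used), and `ContentionResult` and `measure_contention` are hypothetical names.

```cpp
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

struct ContentionResult {
    long counter;      // final value of the shared counter
    double ns_per_op;  // average wall-clock nanoseconds per guarded increment
};

// Runs thread_count threads that each perform ops_per_thread guarded
// increments, and reports the average cost per operation. The timing
// includes thread start-up and join, so it is only a coarse indicator.
ContentionResult measure_contention(int thread_count, int ops_per_thread) {
    std::mutex lock;  // stand-in for the primitive under test (assumption)
    long counter = 0;
    auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < ops_per_thread; ++i) {
                std::lock_guard<std::mutex> guard(lock);
                ++counter;
            }
        });
    }
    for (auto &th : threads) th.join();

    auto elapsed = std::chrono::steady_clock::now() - start;
    double ns = std::chrono::duration<double, std::nano>(elapsed).count();
    long total_ops = static_cast<long>(thread_count) * ops_per_thread;
    return {counter, total_ops > 0 ? ns / total_ops : 0.0};
}
```

Comparing `measure_contention(1, 100000).ns_per_op` with `measure_contention(8, 100000).ns_per_op` gives a coarse picture of how much contention adds per operation on a given machine and OS. The actual numbers depend heavily on hardware and scheduler behavior, so none are quoted here.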

Kewl's picture
Has the dispatch semaphore bug reared its ugly head again?

I have a composition where I add these nodes: Adjust Image Colors, Blur Image, and Apply Mask. Without these nodes, the composition uses around 300% CPU and there are no dropped events. When I insert the three nodes, CPU load drops to between about 100% and 150%, but now with a lot of dropped events.

Without the three nodes:

With the three nodes: