Monday, June 26, 2006

Some recent events, CineForm related.

Last Friday David Taylor (CEO, CineForm) and I presented at the LA Film Festival (thanks to Intel for sponsoring this event.) We had a really nice projector (worth $30k apparently) to show the latest in real-time post-production with Prospect 2k and CineForm RAW -- all running off a single dual-core Intel processor from the next generation, Conroe. Now if only we had had time to promote this last-minute event (last minute to us), some more of you could have been there. Here is an interview with D.T. at the LA Film Festival on the topic of The Revolution Will Be Digitized, by Jason Lopez of PodTech.net.

Also, please check out the latest over on indiefilmlive.blogspot.com; the recent updates from the set of Spoon are engrossing.

Sunday, June 11, 2006

Four Cores -- the More the Merrier

7/28/06 updated with even newer Intel performance numbers
---
We are now seeing a large number of users with dual-proc dual-core systems like AMD Opteron 275s to 285s, and dual-core HyperThreaded processors like the Intel 965 Extreme Edition -- each of these systems presents four processing paths for us to play with. Plus, with quad-core single processors on the horizon, we have good reasons to do a little more with our threading architecture. The main reason I haven't been blogging so much is that improving threading can be a real headache -- particularly as we are already pretty well threaded. Sometimes we might spend a week on a particular algorithm only to realize we gained just another 5% in speed -- and I want much more than that.

For encoding we have N-way threading -- that is really nice -- and we've had that for some time. It is made easier by the fact that we can encode many separate frames simultaneously on ingest -- we launch an encoding thread for every CPU available (real core or logical core like HT.) Threading the decoder is not so simple; after all, it has to be compatible with DirectShow, Premiere Pro, MediaPlayer, etc. Often threading tricks applied to the encoder don't work as well with the decoder. (We do now have an N-way threaded decoder for extreme playback, 4k etc., but not yet for editing -- we're still working on that.)

Previously our products included an efficient 2-way threaded DirectShow decoding component, which was ideal for single dual-core processors, dual Opterons 248-254 and HT-enabled P4s. Yet on quad systems we were seeing the encoder run faster than the decoder (because of N-way threading) -- rather odd behavior for a symmetric compression technology (encoding and decoding should be similar in performance.) Sorry, that was perhaps a lot of boring background about the new threaded decoder that just shipped in all our products -- it will be up to 30% faster on quad systems than the previous decoder.
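The per-frame encoder threading described above can be sketched roughly as follows. This is only an illustration of the general idea (independent frames, one worker per available CPU), not CineForm's actual implementation: `encode_frame` and `encode_frames` are hypothetical names, and zlib stands in for a real video encoder.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor


def encode_frame(frame: bytes) -> bytes:
    """Hypothetical stand-in for a real encoder: compress one frame on its own."""
    return zlib.compress(frame, 6)


def encode_frames(frames, workers=None):
    """Encode independent frames in parallel, one worker per available CPU.

    os.cpu_count() reports logical CPUs, so HT cores are counted as well.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so frames stay in sequence.
        return list(pool.map(encode_frame, frames))


if __name__ == "__main__":
    # Eight dummy 720p-sized frames (one byte per pixel, assumed for the demo).
    frames = [bytes([i % 256]) * (1280 * 720) for i in range(8)]
    encoded = encode_frames(frames)
    print(len(encoded), "frames encoded using up to", os.cpu_count(), "workers")
```

This works for ingest because no frame depends on another. A decoder feeding an editing application usually has to deliver one specific frame at a time, which is part of why the same trick does not carry over directly.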

On a related subject, it has been suggested that we make a standard encoder/decoder test suite to characterize real-time video processing performance on various CPU/memory/drive configurations. What do you think? Here is a sample of a recent characterization we measured for decode speed, using a very demanding sequence captured from an XBOX 360 at 720p. (Note: consoles produce more demanding data than anything acquired through a lens -- a lens adds natural anti-aliasing, which is easier to compress and play back, whereas gaming material typically has an infinite depth of field and harder edges. So HD camera playback rates are higher than those shown here.)

Machine                                              Decoder 2   Decoder 3
Yonah 2.0GHz, 667 FSB                                60.17
Merom 2.0GHz, 667 FSB                                84.42
Merom 2.33GHz, 667 FSB (new data)                    -           122.66
Pentium D 840 EE 3.2GHz, 800 FSB                     93.65
Pentium D 965 EE 3.73GHz, 1066 FSB                   110.83      146.61
Conroe 2.66GHz, 1066 FSB (next generation desktop)   122.81      137.59
Glidewell 3.2GHz, 1066 FSB                           117.74      223.75
Woodcrest 2.66GHz, 1333 FSB (new)                    -           269.92

All numbers are in frames per second for a full-resolution, full-quality decode. Decoder2 is used in our Premiere Pro editing solution (Aspect HD and Prospect HD); its preview mode literally doubles and triples these numbers. Decoder3 is our N-way threaded presentation codec (think Digital Cinema.) The Glidewell's (new Xeon) 223.75 fps was due to the 8 threads (!) that can run on HT dual-core dual-proc Xeons (4 real cores.) 720p at 223.75fps equates to 100fps 1080p and 30p for 4k at 2.35:1 (note: higher resolutions are more efficient on compression, so the frame rate numbers are very conservative.) And we have more optimization coming. :)
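The resolution conversion above can be checked with a little arithmetic. The sketch below assumes decode cost scales linearly with pixel count, that 720p and 1080p mean 1280x720 and 1920x1080, and that 4k at 2.35:1 means 4096 pixels wide by about 1743 high; the frame sizes and the function name `equivalent_fps` are my assumptions for illustration, not figures from the test itself.

```python
def equivalent_fps(measured_fps, src, dst):
    """Scale a frame rate by the pixel-count ratio between two frame sizes.

    Assumes decode cost is proportional to the number of pixels per frame.
    """
    src_pixels = src[0] * src[1]
    dst_pixels = dst[0] * dst[1]
    return measured_fps * src_pixels / dst_pixels


# Frame sizes are assumptions made for this sketch.
HD_720P = (1280, 720)
HD_1080P = (1920, 1080)
FILM_4K_235 = (4096, round(4096 / 2.35))  # 4096 x 1743

if __name__ == "__main__":
    glidewell_fps = 223.75  # Decoder 3 result at 720p
    print("1080p:", round(equivalent_fps(glidewell_fps, HD_720P, HD_1080P), 1))
    print("4k 2.35:1:", round(equivalent_fps(glidewell_fps, HD_720P, FILM_4K_235), 1))
```

Under those assumptions this works out to roughly 99 fps at 1080p and roughly 29 fps at 4k, in line with the rounded 100 fps and 30p figures quoted.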

New Data 7/28/06: The same tests on the new Woodcrest Xeons hit 270 fps, with a lower clock speed than Glidewell and no HT. Amazing!

Note: For those who want AMD numbers, these processors are also very good (and have been for a while.) Yet our in-house AMD systems are older than these newer Intel boxes, so I can't show the best AMD vs. the best Intel. Intel is certainly bringing it on and is currently the performance leader in our office; previously we only recommended AMD for Prospect HD Ingest -- not so any more.

Saturday, June 10, 2006

Creative Cow Interview with Jacob Rosenberg

Here is the direct link to the Creative Cow podcast, where Jacob Rosenberg discusses all things Premiere Pro, with some focus on past, present and future HD/Film projects using CineForm. Plus he drops my name, so I have to link it. :)

Interview with Jacob Rosenberg