How Fast is Molehill?
Molehill is out!
photo © 2005 Andi | more info (via: Wylio)
Adobe took everyone by surprise at the Flash Gaming Summit 2011, by making available on Adobe Labs their super secret new API, the 3D hardware accelerated Flash Player, codename Molehill.
Molehill is still not officially released. It’s not even in beta. What Adobe has done is create a sort of pre-beta program, called the Flash Incubator Program. It is aimed at developers who are interested in testing and providing feedback on potential features and features under consideration for future versions of the Flash Player. Features highlighted in the incubator program may or may not ship in future releases of the Flash Player.
So, we still don’t even know when the real first release of the Molehill Flash player will be out for the general public.
But…who cares! The toy is out, we can play with it and talk about it.
Why is Molehill important?
Performance!
The Flash Player has an incredible penetration rate. It’s installed on more than 90% of all computers around the world. And it’s amazing how much its technology has been lagging. Running a Flash application is like switching to a 15-year-old computer. Until now, the Flash Player has been unable to take advantage of the 3D hardware acceleration available on most computers today. And this means that it’s not been possible to create 3D Flash experiences that even get close, in quality and complexity, to those that can be made in standard native applications.
OK, but how fast is it?
This is the million-dollar question that everybody is going to ask.
With the 3D rendering techniques we’ve all been using so far in Flash, all the rendering took place on the CPU. And, like I said, this meant super simple models and scenes.
How fast is the Molehill player going to be, and how detailed are the models that we can now display?
I am going to provide some quantitative data about Molehill performance.
Let’s start with CPU-based rendering: the old way, used by all current 3D engines, such as Papervision3D and Away3D.
First of all, download the Molehill Flash Player here and install it on your computer. (Disclaimer: it is not beta software. It’s PRE beta. Install it at your own risk. Molehill is cool, but if it fries your computer, don’t come to me and complain).
Here is a sample application that renders a high-detail 3D model of a “stone spring” using the standard CPU-based drawTriangles call, available in Flash Player 10. The model consists of about 20000 triangles, 4000 for each coil of the spring. It is rendered using back-face culling, so the number of triangles actually rendered each frame is about half of that: 10000 or so.
You can see the number of polys and the frame rate in the HUD at the top left. On my MacBook Pro, this 3D scene runs at less than 10 frames per second, meaning that 20000 triangles is way too many for a software-rendered 3D scene drawn with drawTriangles, the standard Flash Player 10 call used for 3D rendering.
Let’s compare this with the Molehill renderer. Here’s the same model rendered with Molehill (close the previous browser tab with the CPU renderer before opening this one). On my Mac, this runs super smoothly at 60 fps (except at startup, which takes a bit, but that can be optimized and it’s not really a measure of rendering performance).
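To make the back-face culling step a bit more concrete, here is a minimal Python sketch. It is not the demo's code: the function names and the "counter-clockwise means front-facing" convention are assumptions chosen for illustration. It decides whether a projected triangle faces the viewer by looking at the sign of its area in screen space.

```python
def signed_area(a, b, c):
    """Twice the signed area of a 2D triangle.

    Positive when a, b, c are in counter-clockwise order (y axis pointing up).
    """
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def is_front_facing(a, b, c):
    # Assumption: counter-clockwise winding marks a front face.
    return signed_area(a, b, c) > 0


def cull_back_faces(triangles):
    """Keep only the triangles that face the viewer."""
    return [t for t in triangles if is_front_facing(*t)]


front = ((0, 0), (1, 0), (0, 1))  # counter-clockwise
back = ((0, 0), (0, 1), (1, 0))   # same triangle, clockwise
visible = cull_back_faces([front, back])
```

On a closed shape like the spring, roughly half of the triangles face away from the camera at any time, which is where the "about 10000 out of 20000" figure comes from.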
Cool! Let’s increase the number of polys now. Here’s a scene of a similar model, made of 80000 polys rendered with Molehill. Again, it runs super smoothly.
There’s no point in running this one with the CPU renderer, as it would slow to a crawl.
So, the conclusion is that Molehill can render 80000-triangle models without a hitch. An 80000-triangle model is quite a detailed model. Can we go any higher?
The behemoth test
Unfortunately there’s a hard limit in Molehill’s Vertex Buffers. They cannot contain more than 64k vertices. This particular geometry that I used for the 80000 triangles spring contains about 40000 vertices. So, there’s not really room to go much more detailed than this by using a single spring model.
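As a rough illustration of that limit, here is a small Python sketch that checks whether a mesh fits in one Vertex Buffer and how many buffers it would need otherwise. The constant is taken from the "64k" figure above and its exact value is an assumption; the function names are hypothetical.

```python
import math

# "64k" as quoted in the post; the exact limit is an assumption here.
MAX_VERTICES_PER_BUFFER = 64 * 1024


def fits_in_one_buffer(vertex_count, limit=MAX_VERTICES_PER_BUFFER):
    return vertex_count <= limit


def buffers_needed(vertex_count, limit=MAX_VERTICES_PER_BUFFER):
    """Lower bound on the number of buffers for a given vertex count."""
    return math.ceil(vertex_count / limit)


spring_vertices = 40_000                      # the 80000-triangle spring
single = buffers_needed(spring_vertices)      # fits in one buffer
merged = buffers_needed(2 * spring_vertices)  # two springs merged would not
```

Note that this is only a lower bound: when a mesh is really split, vertices shared by triangles that land in different buffers have to be duplicated, so the actual count can be higher.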
What we can do is create a scene with more than one spring. Here is a scene made of four 80000-triangle springs.
It consists of about 320000 triangles! And it runs super smoothly, still at 60 frames per second. Note that each spring is being rendered with its own Vertex Buffer and Texture on the GPU. Therefore, as far as the GPU knows, the four springs are four different models.
Let’s double that! Make it 8 springs, of 80k triangles each. Here it is.
This scene does take a bit to set up. But once it’s all loaded, it still runs pretty smoothly, at around 30 frames per second. So, the frame rate halved, but we’re rendering 640000 triangles, all on the screen at the same time!
Look Ma…1 Million Polys!
Ok, now for the last test. Let’s be masochistic and double our springs once more; let’s get 8 more on the scene. Here’s the result. Note, this last test will take quite a bit of time to set up. It will slow your computer to a crawl for a few minutes while the scene is being created. Just be patient.
Once the scene is uploaded to the GPU and set up, it still manages to run. Frame rate on my Mac is around 15 fps. Not great, but hey, we’ve got 1.2 million triangles on the screen!
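A quick way to compare the three multi-spring tests is to turn them into triangles per second. The Python sketch below uses only the spring counts and the approximate frame rates quoted above; the names are made up for illustration. Note that 16 springs of 80000 triangles is 1.28 million, which the text rounds to 1.2 million.

```python
SPRING_TRIANGLES = 80_000

# (number of springs, approximate fps reported in the post)
measurements = [(4, 60), (8, 30), (16, 15)]


def triangles_per_second(springs, fps, per_spring=SPRING_TRIANGLES):
    return springs * per_spring * fps


throughput = [triangles_per_second(s, f) for s, f in measurements]

for (springs, fps), tps in zip(measurements, throughput):
    total = springs * SPRING_TRIANGLES
    print(f"{total:>9} triangles at {fps} fps -> {tps / 1e6:.1f} M triangles/s")
```

All three scenes work out to about 19.2 million triangles per second, which suggests the frame rate is scaling almost linearly with triangle count on this GPU. The 60 fps figure may well be capped by the display refresh, so the first row is best read as a lower bound.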
A few caveats
This is a very specific kind of test, and it doesn’t assess the performance of the Molehill rendering pipeline in all possible kinds of rendering situations.
It’s very rare that a scene consists of a single, or a few, super detailed models, like I did here. More frequently you’re going to have a multitude of much less detailed models. And, with a similar total triangle count, that would probably end up being slower than our sample, depending on the number of models.
Generally, the biggest bottleneck in a 3D rendering pipeline is fill rate, meaning how many texels per second the video card can fill. With this test I’m not really pushing the pipeline’s fill rate, as all the triangles that we are rendering end up being extremely small on the screen.
Also, I used a super simple Shader here. Here’s the code of the shader, in AGAL Assembler.
Vertex Shader:
m44 op, va0, vc0 // pos to clipspace
mov v0, va1 // copy uv
Pixel Shader:
tex ft1, v0, fs0 <2d,linear,nomip> // sample texture
mov oc, ft1 // copy sampled texture to output
The Vertex Shader simply transforms the 3D vertices to the screen. And the Pixel Shader maps a single texture onto the model without any lighting or effects. The texture is 1024×1024 pixels (scaled down from the original 2048×2048; see the update below).
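For readers who don't speak AGAL, here is a CPU-side Python emulation of what those four instructions do. It is a sketch under assumptions: m44 is modelled as one 4-component dot product per constant register, which is my reading of the opcode, and the texture lookup uses nearest-neighbour sampling instead of the linear filtering requested in the real shader. All function names are hypothetical.

```python
def m44(vec, rows):
    """Model of AGAL m44: each output component is dot4(vec, row)."""
    return tuple(sum(v * r for v, r in zip(vec, row)) for row in rows)


def vertex_shader(position, uv, matrix_rows):
    clip_position = m44(position, matrix_rows)  # m44 op, va0, vc0
    varying_uv = uv                             # mov v0, va1
    return clip_position, varying_uv


def pixel_shader(uv, texture):
    """Nearest-neighbour lookup standing in for the 'linear' filter."""
    height, width = len(texture), len(texture[0])
    x = min(int(uv[0] * width), width - 1)
    y = min(int(uv[1] * height), height - 1)
    return texture[y][x]                        # tex ft1, v0, fs0 / mov oc, ft1


identity = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
checker = [["black", "white"],
           ["white", "black"]]

clip_position, uv = vertex_shader((1.0, 2.0, 3.0, 1.0), (0.75, 0.25), identity)
colour = pixel_shader(uv, checker)
```

With the identity matrix the position passes through unchanged, and the UV pair simply selects a texel. That is all this shader does: no lighting, no second texture.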
Many times a real model will be using more than one texture, and it will have more complex shaders.
Nonetheless this test is good as a rule-of-thumb reference point: in the simplest rendering case, Molehill can render a single textured model with 80000 triangles like a piece of cake. A scene with 320k triangles can be rendered quite smoothly, and one with 640k triangles is still ok to render. A 1.2 million triangle scene is a bit extreme, but it doesn’t bring the computer to a complete crawl.
Oh, note that I did all these tests with the integrated NVidia 9400M GPU of my Mac. A dedicated video card will probably run these tests faster.
Update:
I just scaled down the texture to 1024×1024 for all the samples, as I received some error reports from some users.
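To get a feel for why the smaller texture is so much lighter, here is a back-of-the-envelope Python estimate. It assumes uncompressed 32-bit RGBA texels and no mipmaps, which is an assumption on my part and not something measured from the demo; it also relies on the fact, mentioned earlier, that every spring uploads its own texture.

```python
BYTES_PER_TEXEL = 4  # assumption: uncompressed 32-bit RGBA, no mipmaps


def texture_bytes(width, height, bytes_per_texel=BYTES_PER_TEXEL):
    return width * height * bytes_per_texel


def scene_texture_mib(size, texture_count):
    """Total texture memory in MiB for square textures of the given size."""
    return texture_bytes(size, size) * texture_count / (1024 * 1024)


# The largest test has 16 springs, each with its own texture.
before = scene_texture_mib(2048, 16)
after = scene_texture_mib(1024, 16)
```

Under these assumptions the 16-spring scene drops from 256 MiB of texture data to 64 MiB, a factor of four, which fits with the resource errors going away after the change.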
I am interested in getting a sense of what the performance difference is on other 3D platforms such as Unity and will be exploring that soon. Any advice is welcome.

The scene with 1.2 million triangles still renders at 60fps on my GPU (8800GTS 512), but it does get a bit hotter than idle
I also got a couple of invalid BitmapData calls for the texture while it was loading, but it seemed to work fine after I ignored them.
Hey that’s cool!
Could you please post the error messages you get…I can try to fix them.
Here are 2 of the errors I received when trying to load the 1.2 million poly example. The first error appeared approx. 8 times as I ignored and hit Continue. The second error persisted until I dismissed and closed the window.
Error: Error #3684: Texture creation failed. Internal error.
at flash.display3D::Context3D/createTexture()
at unit9.tanuki.renderingsystem.renderers.molehill::MolehillMesh/getMolehillTexture3D()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/renderingsystem/renderers/molehill/MolehillMesh.as:234]
at unit9.tanuki.renderingsystem.renderers.molehill::MolehillRenderer/renderGeometry()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/renderingsystem/renderers/molehill/MolehillRenderer.as:150]
at unit9.tanuki.scenegraph::RenderableNode/vRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/scenegraph/RenderableNode.as:100]
at unit9.tanuki.scenegraph::SceneNode/vRenderChildren()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/scenegraph/SceneNode.as:349]
at unit9.tanuki.scenegraph::Scene/onRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/scenegraph/Scene.as:104]
at unit9.galvanous.scenegraph::GalvanousScene/vOnRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/galvanous/scenegraph/GalvanousScene.as:61]
at unit9.tanuki.views::HumanView/vOnRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/views/HumanView.as:141]
at unit9.tanuki.application::GameApp/onRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/application/GameApp.as:260]
at unit9.tanuki.application::GameApp/onMainLoop()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/application/GameApp.as:229]
Error: Error #3691: Resource limit for this resource type exceeded.
at flash.display3D::Context3D/createTexture()
at unit9.tanuki.renderingsystem.renderers.molehill::MolehillMesh/getMolehillTexture3D()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/renderingsystem/renderers/molehill/MolehillMesh.as:234]
at unit9.tanuki.renderingsystem.renderers.molehill::MolehillRenderer/renderGeometry()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/renderingsystem/renderers/molehill/MolehillRenderer.as:150]
at unit9.tanuki.scenegraph::RenderableNode/vRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/scenegraph/RenderableNode.as:100]
at unit9.tanuki.scenegraph::SceneNode/vRenderChildren()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/scenegraph/SceneNode.as:349]
at unit9.tanuki.scenegraph::Scene/onRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/scenegraph/Scene.as:104]
at unit9.galvanous.scenegraph::GalvanousScene/vOnRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/galvanous/scenegraph/GalvanousScene.as:61]
at unit9.tanuki.views::HumanView/vOnRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/views/HumanView.as:141]
at unit9.tanuki.application::GameApp/onRender()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/application/GameApp.as:260]
at unit9.tanuki.application::GameApp/onMainLoop()[/Users/marco/Documents/Projects/tanuki/svn/trunk/project/develop/Tanuki/as3/shared/unit9/tanuki/application/GameApp.as:229]
Ok, I should have fixed it.
I scaled down the texture to 1024×1024 and it seems to be much less memory and resource hungry.
WoW, excellent… these are exciting times for flash. Seems a couple years late, but at least they finally made a step in the right direction. I wonder how long it will take to get molehill running well on mobile through air.
http://www.davegeurts.com
The Geforce 9400M isn’t exactly an integrated graphic card, but the newer integrated graphics perform like it (hd6310 and sandy bridge HD3000).
It’s pretty impressive. I want to test this on old computers to see the new software renderer.
Hey, thanks for the comment!
Yeah, I agree. It’s not totally correct to call it “integrated graphics”. What I mean is that its role, for example on the previous MacBook Pro, is to be the lesser of the two video cards available, using shared memory, etc. This is similar to the role often played by integrated Intel graphics.
Got 60fps on all of the tests, including the first software render test.
I guess I need to use a slower computer, but this is only a 15″ laptop running Windows 7. Someone ought to try the tests on a dual-booting Mac to see how Mac and Windows performance compare for Molehill. Flash on the Mac has always lagged behind the Windows version by varying amounts, so I’m wondering how it fares with Molehill.
Thanks to your post, I just realized that in my latest update I mistakenly overwrote the software rendering test with the Molehill one. I just fixed that, so it should be running ok now.
Btw, in that last update I scaled down the textures and removed mip mapping (I did that because the application was crashing for some users). So performance improved a bit on my Mac too. I’ll update those numbers.
Good call on the dual boot Mac! I will test that too!
Hi,
Nice demos, got 60fps on 1.2M poly test, on an ATI Radeon 4800. Could you also post the source files? I’d like to modify them to increase the number of models on the scene to test the limits on my machine. That is, if you didn’t already post the sources and I missed them on the page
Thanks
Hi,
the current demo uses a 3D engine we’re working on that’s still unfinished, and we don’t want to release it at this stage.
However I’m working to extract the code of this demo as a separate app, so that I can make it available for download.
As soon as I’m done with that I’ll be posting the code.
Thanks for the comments!
Awesome! My GTX 550 runs 1.2 million triangles at 60 FPS!!!
I AM SO MOVING TO FLASH PLATFORM RIGHT NOW.
But wait, doesn’t this mean the death of Java 3d games?
Performance on an older slower laptop: 1.8 gigahertz Intel Core Duo 1GB RAM Win XP(SP3) running nothing else.
Reset after spring appeared and moved it a little to get ave fps/cpu usage
orig flash 6fps cpu 55%
mole20k 25fps cpu 75%
mole80k 2fps cpu 95%
mole320k 2fps cpu 95%
mole320k 2fps cpu 98%
last line should be mole640k not mole320k – did not try 1m one.
I am running it on Linux, I am not getting any output, though the stats section is displayed and the stats are getting updated (current fps, avg fps, time etc). Any idea?
Some info
– I have a nvidia quadro fx 1800 with nvidia driver version 270.41.06
– Hardware rendering is used, the cpu usage is low
– Using flash player 11.0.1.134 (latest one available)
– CPU rendering example works and displays correctly (albeit with a low frame rate)
– There is a line at the bottom which reads “RenderMode: Software (Direct blitting)”
Suresh I have the same problem.
In my opinion it’s down to changes in the Flash Player since the pre-beta that these tests were targeting.