This is the first in a short series of posts about our recent collaboration with Chukong Technologies on the Cocos2d-x game engine. In this first article I give a short overview of our analysis and optimization process for a Cocos2d-x based graphics demo called Fantasy Warrior 3D.

What is Cocos2d-x?

Cocos2d-x is a suite of open-source, cross-platform game development tools, written in C++. The engine is the world’s most popular open source game engine and, according to the latest AppBrain data, it is the 2nd most popular game engine on Google Play after Unity.

The engine has traditionally been 2D only; however, over the last few years, Chukong have diversified the engine and added 3D rendering to complement its popular 2D toolset.

What is Fantasy Warrior 3D?

Fantasy Warrior 3D is a showcase project using Cocos2d-x 3.4. It is an important demo that covers several core 3D features in Cocos2d-x:

  • Sprite3D
  • Animation3D
  • Mesh
  • Billboard
  • Camera
  • Light
  • New audio engine

Its main purpose is to show developers how to create a 3D game using Cocos2d-x and it is an ideal starting place for optimizing a graphics application created using Cocos2d-x with our PowerVR SDK toolset.

Fantasy Warrior 3D - cocos2d-x

The code is hosted on GitHub and distributed under the MIT licence:

PowerVR GPU architecture overview

First of all, let us go through an overview of PowerVR TBDR Architecture.

Tile Based Deferred Architecture

The architecture has two key features: tiling and deferred rendering.

Tiling is a process used to increase the efficiency of rendering images on a display. It splits the image into small rectangular regions, which we call tiles, and sorts the geometry data into the tiles it covers. Once rendered, the tiles are displayed together as one image.

PowerVR TBDR - tiling

Each tile is rasterized and processed separately. This requires less processing power overall, because the GPU can use on-chip buffers for colour, depth and stencil read-modify-write operations instead of wasting bandwidth sending data to and from system memory.
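To make the tiling step more concrete, here is a minimal CPU-side sketch of how triangles could be sorted into tiles by their screen-space bounding boxes. It is only an illustration of the idea, not how PowerVR hardware implements it; the `Bounds` type, the `binTriangles` function and the 32-pixel default tile size are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Screen-space bounding box of one triangle, in inclusive pixel coordinates.
// (Hypothetical type used only for this sketch.)
struct Bounds {
    int minX, minY, maxX, maxY;
};

// Returns one list per tile (row-major order) holding the indices of the
// triangles whose bounding box overlaps that tile.
// The default tile size is an illustrative assumption.
std::vector<std::vector<std::size_t>> binTriangles(
    const std::vector<Bounds>& triangles, int width, int height, int tileSize = 32) {
    assert(width > 0 && height > 0 && tileSize > 0);
    const int tilesX = (width + tileSize - 1) / tileSize;   // round up
    const int tilesY = (height + tileSize - 1) / tileSize;  // round up
    std::vector<std::vector<std::size_t>> bins(static_cast<std::size_t>(tilesX) * tilesY);

    for (std::size_t i = 0; i < triangles.size(); ++i) {
        const Bounds& b = triangles[i];
        // Skip triangles that are entirely off screen.
        if (b.maxX < 0 || b.maxY < 0 || b.minX >= width || b.minY >= height) continue;
        // Clamp to the screen, then convert the pixel range to a tile range.
        const int tx0 = std::max(b.minX, 0) / tileSize;
        const int ty0 = std::max(b.minY, 0) / tileSize;
        const int tx1 = std::min(b.maxX, width - 1) / tileSize;
        const int ty1 = std::min(b.maxY, height - 1) / tileSize;
        for (int ty = ty0; ty <= ty1; ++ty)
            for (int tx = tx0; tx <= tx1; ++tx)
                bins[static_cast<std::size_t>(ty) * tilesX + tx].push_back(i);
    }
    return bins;
}
```

Each per-tile list is small enough to be processed on its own, which is what allows the colour, depth and stencil data for a tile to stay in on-chip memory while that tile is rendered.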

For the deferred part, there is a process called Hidden Surface Removal (HSR), which removes overdraw from opaque geometry.

PowerVR TBDR - deferred rendering

In a typical Immediate Mode Renderer (IMR) architecture, the scene displayed above would have the red and purple colours calculated even in places where they would be obscured by the closer shapes. However, our architecture can determine opaque fragment visibility before shaders are executed, which enables the GPU to discard all fragment shading operations that do not contribute to the final image colour. Removing redundant fragment shader execution from the pipeline saves time and processing power. If you are interested in the details of TBDR, you can find more here.
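The difference can be illustrated with a deliberately simplified model: a one-dimensional "screen" covered by opaque spans. The sketch below counts how many fragments would be shaded if each span is shaded as it is submitted (with a depth test), and how many would be shaded if visibility is resolved first. The `Span` and `ShadeCounts` types and the `countShadedFragments` function are hypothetical names for this example, and the model ignores everything real hardware has to deal with, such as blending and alpha testing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// An opaque span covering pixels [x0, x1) of a one-dimensional screen.
// Smaller depth means closer to the viewer.
struct Span {
    int x0, x1;
    float depth;
};

struct ShadeCounts {
    std::size_t immediate;  // fragments shaded when shading in submission order
    std::size_t deferred;   // fragments shaded when visibility is resolved first
};

ShadeCounts countShadedFragments(const std::vector<Span>& spans, int width) {
    assert(width > 0);
    const float kFar = std::numeric_limits<float>::infinity();
    std::vector<float> depth(static_cast<std::size_t>(width), kFar);

    // Immediate model: every fragment that passes the depth test at the
    // moment it is submitted gets shaded, even if it is covered up later.
    std::size_t immediate = 0;
    for (const Span& s : spans) {
        for (int x = std::max(s.x0, 0); x < std::min(s.x1, width); ++x) {
            if (s.depth < depth[x]) {
                depth[x] = s.depth;
                ++immediate;
            }
        }
    }

    // Deferred model: only the final visible fragment of each covered
    // pixel is shaded, regardless of submission order.
    std::size_t deferred = 0;
    for (float d : depth)
        if (d != kFar) ++deferred;

    return {immediate, deferred};
}
```

With three overlapping spans submitted back to front, the immediate count grows with every layer, while the deferred count stays equal to the number of covered pixels. Submitting the same spans front to back brings the immediate count down to the deferred one, which is why draw order matters so much more on an IMR.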

Profiling graphics processors

A good start when profiling is to follow these three steps to ensure you get the best performance from your application.

  • Analyse performance using an appropriate tool
  • Use the data provided to figure out where the bottleneck is in your application
  • Modify your application to eliminate the bottleneck.

For example, if you have uncompressed textures and you want to reduce your memory bandwidth, you need to compress the textures to get better performance.
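As a rough back-of-the-envelope illustration of why this helps, the sketch below compares the size of a single texture level at 32 bits per pixel (uncompressed RGBA8888) and at 4 bits per pixel (as in the PVRTC 4bpp format). The `textureBytes` helper is a hypothetical name for this example; it ignores mipmaps, headers and any minimum block sizes a real compressed format imposes.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of one width x height texture level stored at the given
// number of bits per pixel. Mipmaps and format-specific padding are ignored.
std::uint64_t textureBytes(std::uint32_t width, std::uint32_t height, std::uint32_t bitsPerPixel) {
    assert(bitsPerPixel > 0);
    return static_cast<std::uint64_t>(width) * height * bitsPerPixel / 8;
}
```

For a 1024×1024 texture, `textureBytes(1024, 1024, 32)` gives 4,194,304 bytes (4 MiB), while `textureBytes(1024, 1024, 4)` gives 524,288 bytes (512 KiB). Under these assumptions, each full read of the compressed texture moves one eighth of the data.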

One of the most important steps, and one that people sometimes forget, is to complete the cycle: go back to the first step and run the analysis tools again to verify that the changes have actually improved performance. Your change may have caused a regression instead, or you may have introduced a bug into the application that harms performance.
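One simple way to close the loop is to record frame times before and after a change and compare them. The sketch below is a minimal example of that check; the `Verdict` enum, the `meanMs` and `compareRuns` functions and the 2% tolerance are assumptions for illustration, and the frame times themselves would come from whichever profiling tool you use.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

enum class Verdict { Improved, Regressed, NoChange };

// Mean of a non-empty list of frame times, in milliseconds.
double meanMs(const std::vector<double>& frameTimesMs) {
    assert(!frameTimesMs.empty());
    return std::accumulate(frameTimesMs.begin(), frameTimesMs.end(), 0.0) /
           static_cast<double>(frameTimesMs.size());
}

// Compares mean frame time before and after a change. Differences smaller
// than the relative tolerance are treated as noise (2% is an arbitrary
// illustrative threshold).
Verdict compareRuns(const std::vector<double>& before,
                    const std::vector<double>& after,
                    double tolerance = 0.02) {
    const double b = meanMs(before);
    const double a = meanMs(after);
    if (a < b * (1.0 - tolerance)) return Verdict::Improved;
    if (a > b * (1.0 + tolerance)) return Verdict::Regressed;
    return Verdict::NoChange;
}
```

A mean is a coarse measure; for a real application you would probably also want to look at the worst frames, since an occasional spike can be hidden by a good average.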

Here are some free tools you can use for the three steps above.

PowerVR Graphics toolset for performance profiling

Our PowerVR tools and SDK provide developers with profiling and debugging tools. These include:

  • PVRTune (a graphics core performance analysis tool)
  • PVRTrace (an OpenGL ES analysis tool)
  • PVRMonitor (an Android application that allows you to view real-time hardware performance stats)
  • PVRScope (a tool for integrating some of the functionality of PVRTune inside your own application)
  • PVRShaderEditor (a tool for shader code optimization).


PVRTune is a real-time hardware performance analysis tool. It runs live on the device and lets you see data coming straight from the GPU, such as timing data. It also allows you to rapidly view CPU and GPU performance to help you mitigate performance bottlenecks.


Combined with the PVRTrace recording libraries, PVRTune can capture OpenGL ES API usage data, such as API call timing and counters (e.g., number of submitted triangles per-frame and number of texture modifications per-frame). The PVRTrace libraries also enable users to make real-time modifications to the OpenGL ES render state, e.g., forcing all texture samples to use a 2×2 image stored in the graphics core cache.

If you are interested in PVRTune, you can find more here.


PVRTrace is an OpenGL ES call recording library and utility. The tool intercepts OpenGL ES calls and saves them to a file, which can be played back on other devices or desktop machines. This lets you focus on the graphics API calls made by the application rather than diving into an app's rendering engine code.


The recording can then be played back on other devices, or on a Linux or Windows desktop system, to help identify where things are or are not working. PVRTrace also comes with a Graphical User Interface (GUI) that analyses an API call stream and makes it easy to inspect the calls. One powerful feature of the tool is Static Analysis, which gives you at-a-glance feedback on where mistakes have been made with the API, e.g., where uncompressed textures have been used.

If you are interested in PVRTrace, you can find more here.


PVRMonitor is an Android application that allows you to view real-time hardware performance stats. The stats show information about processor usage on the CPU and PowerVR graphics hardware, with negligible impact on performance.


The data is presented as a bar graph that is updated in real-time and overlaid on top of your currently running applications.

If you are interested in PVRMonitor, you can find more here.

Coming soon and further reading

In the next post, I will explain how the tools described above were used to analyse the performance of Fantasy Warrior 3D on our reference device.

Here is a menu to help you navigate through every article published in this optimization series:

Please let us know if you have any feedback on the materials published on the blog and leave a comment on what you’d like to see next. Make sure you also follow us on Twitter (@ImaginationPR, @GPUCompute and @PowerVRInsider) for more news and announcements from Imagination.

About the author: Sun Kevin


Kevin Sun is a leading PowerVR developer technology engineer for Imagination Technologies.
