“There is truth to be found on the unknown shore, And many will find where few would seek.”
[Note: The quote above is correct. You'll often find it written "And many will find what few would seek" due, I suspect, to that line's appearance in the first chapter of the popular book "Declare" by Tim Powers, which is how I came to know of it. You can find the original text in Stephen's Lapsus Calami.]
It’s known that PNG images can be optimized for smaller sizes. It’s also known that the PNG encoder that comes with the .NET Framework doesn’t do much (if any) optimization. While preparing for a blog post on screen capture techniques, I ran into these facts again.
But this time, armed with 11 thousand screenshots taken over the last 7 years, I decided to look at the numbers. What I have found so far has confirmed some of my expectations and surprised me elsewhere. Let’s explore…
I have a collection of 11206 screenshots, all of which have been taken using versions of Hypersnap through the years. I’ve taken many more screenshots, but these are the ones I have saved for reference.
Almost all of these are on Windows XP, Vista, Windows 7 – with a smattering of Ubuntu thrown in. Typically these are screenshots of applications and web pages. None are “game” screenshots. I consider them a reasonable sample of “real-world” screen captures.
All the images were stored as 24-bit PNG files – again, all generated by Hypersnap.
I simply used .NET 4.0 via Visual Studio 2010 to load each image and encode it as a PNG using the .NET Framework’s encoder. To be specific, I saved the PNG into a memory stream for performance. I confirmed that the encoding in the stream matched exactly what would be written to a real on-disk file.
On each conversion I collected the original file size in bytes and the output size in bytes.
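The measurement loop described above can be sketched roughly as follows. This is Python rather than the .NET code actually used, and `measure_sizes` and its `reencode` parameter are hypothetical names: `reencode` stands in for whatever encoder is being tested and is assumed to take the original file's bytes and return the re-encoded PNG bytes.

```python
import os
from typing import Callable, Iterable, List, Tuple


def measure_sizes(
    paths: Iterable[str],
    reencode: Callable[[bytes], bytes],
) -> List[Tuple[int, int]]:
    """Return (original_size, reencoded_size) in bytes for each file.

    `reencode` is a placeholder for the encoder under test (an assumption,
    not part of the original experiment's code).
    """
    results = []
    for path in paths:
        # Size of the file as stored on disk.
        original_size = os.path.getsize(path)
        with open(path, "rb") as f:
            data = f.read()
        # Encode in memory only; nothing is written back to disk.
        reencoded = reencode(data)
        results.append((original_size, len(reencoded)))
    return results
```

Keeping the encoder as a parameter means the same loop can be reused later to compare a different encoder or optimizer against the same set of files.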
HYPERSNAP ENCODER VS. .NET FRAMEWORK ENCODER
In general, the original Hypersnap-encoded screenshots were smaller. I certainly expected Hypersnap to do better, and it did in 94% of cases. I hadn’t expected that in some cases the .NET Framework encoder would do better (6% of cases). At no point did any of the files have the same size.
The raw numbers:
- Hypersnap Smaller 94% = 10517
- .NET Framework Smaller 6% = 689
- Equal Sizes 0% = 0
- Total = 11206
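As a quick sanity check, the percentages can be recomputed from the raw counts. This is only a small Python sketch; the dictionary labels and the `shares` helper are illustrative names, not anything from the original measurement code.

```python
# Counts reported above.
counts = {
    "Hypersnap smaller": 10517,
    ".NET Framework smaller": 689,
    "Equal sizes": 0,
}


def shares(counts):
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


total = sum(counts.values())
percentages = shares(counts)
print(total, percentages)
```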
HOW MUCH BIGGER ARE THE .NET FRAMEWORK ENCODED FILES?
Let’s consider the ratio of the .NET-encoded size to the Hypersnap size. The larger the ratio, the worse the .NET Framework is doing – the bigger the file it is creating.
Looking at all 11206 files, the average size ratio is 1.25. On average, the .NET encoder produces files 1.25 times the size of the Hypersnap-encoded files. In other words, Hypersnap files are about 20% smaller on average (roughly 80% of the size) than what .NET System.Drawing produces.
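To make the ratio arithmetic concrete, here is a small Python sketch. The function names are illustrative only, and `pairs` is assumed to be a list of (Hypersnap size, .NET size) tuples in bytes. A ratio of 1.25 means a Hypersnap file is 1/1.25 = 0.8 of the .NET size, which is about 20% smaller.

```python
from statistics import mean


def mean_size_ratio(pairs):
    """Average of the per-file ratios (.NET size / Hypersnap size)."""
    return mean(dotnet / hypersnap for hypersnap, dotnet in pairs)


def percent_smaller(ratio):
    """How much smaller (in percent) the denominator file is for a given ratio."""
    return (1 - 1 / ratio) * 100


# With the reported average ratio of 1.25, this comes out at about 20.
print(round(percent_smaller(1.25), 1))
```

Note that an average of per-file ratios is not the same number as the ratio of the two total sizes, so the two figures can differ slightly.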
Now consider the actual bytes (as reported by the System.IO.FileInfo object’s Length property):
Total file size for Hypersnap = 1,595,099,129 bytes = ~1.49 GB
Total file size for .NET = 2,006,670,605 bytes = ~1.87 GB
Delta = 411,571,476 bytes = ~0.38 GB
So, it looks like Hypersnap is saving me a bit more than a third of a GB of file size.
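The totals can be checked with a few lines of Python. The constant and function names here are just illustrative. The GB figures quoted above work out when a GB is taken as 2^30 bytes, so that is the assumption used in this sketch.

```python
HYPERSNAP_TOTAL = 1_595_099_129  # bytes, as reported above
DOTNET_TOTAL = 2_006_670_605     # bytes, as reported above
BYTES_PER_GB = 2 ** 30           # assumption: "GB" means 2^30 bytes here


def to_gb(num_bytes):
    """Convert a byte count to GB (2^30 bytes)."""
    return num_bytes / BYTES_PER_GB


delta = DOTNET_TOTAL - HYPERSNAP_TOTAL
total_ratio = DOTNET_TOTAL / HYPERSNAP_TOTAL

print(delta)                              # 411571476
print(round(to_gb(HYPERSNAP_TOTAL), 2))   # 1.49
print(round(to_gb(DOTNET_TOTAL), 2))      # 1.87
print(round(to_gb(delta), 2))             # 0.38
print(round(total_ratio, 3))              # 1.258
```

The ratio of the totals (about 1.258) is close to, but not identical to, the 1.25 average of the per-file ratios, since the two are computed differently.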
THOUGHTS AND QUESTIONS
- If you are relying on the .NET Framework to encode PNG files, you should consider the size penalty you will, on average, be paying.
- It’s unclear why Hypersnap is ever worse than the .NET Framework – when I looked at the actual images that Hypersnap did worse on, I could not find a clear pattern.
- It’s worth finding out what happens to these PNGs when using a specialized tool to optimize the size. I’ll try this soon.
I started the Viziblr blog on this date exactly one year ago with this announcement. Now, I’ll take a moment to reflect on the experience and think about the future.
I’VE HAD FUN. I’M STILL HAVING FUN
Writing about the topics that fascinate me has been very enjoyable. I was concerned that I wouldn’t have enough to say or that my initial enthusiasm would fade – and although I have had the occasional slow period – I have kept up activity throughout the year.
HELPING OTHERS IS REWARDING
It pleases me to no end that my most popular post is about fixing the problems people are having with their Wacom tablets and Windows 7. I know the problem has been very frustrating, so to be able to help so many people, and to get a comment on that topic where someone says thanks, truly warms my heart. In a similar vein, those blog posts that introduce readers to new technologies or provide tips or guidance around a topic have been well-received. I want to keep playing this role.
I get a lot of traffic on my MSDN blog – it’s around the 150th in popularity across all Microsoft-hosted blogs (MSDN, TechNet, etc.) – according to the numbers I get sent every month. BTW, we have thousands of blogs – being at 150 from the top is not bad. Viziblr isn’t as popular – which is OK with me – this is not about numbers. I have no ads on the blog, I have no traffic goals, I do this out of my passion only. If the numbers are interesting to me at all, it’s only because they let me know that I’ve got something useful to share.
- Original content. If you read the blog, you know I have a balance between simply sharing interesting things I find and creating new and original content for posts. For this next year, I’ll be focusing even more on producing original material.
- Screencasts. These have been well received and, encouraged by the feedback, I will be doing many more over the year.
- Code. I’ve already published my first tool under the Viziblr branding. This facet of the blog will get ramped up quite a bit this year. Some of this will be tools, libraries, or simply discussions of techniques – but it will all be related to visualization, infographics, etc. and not general coding.
- Visio. Many people know I use Visio a lot to create charts and infographics. I also have a lot of “power user” knowledge that can be very helpful. Currently, I’m doing most of this blogging on the MSDN blog, but I will transition it to this blog over the next few months.
- Infographics. Expect this to be a frequent topic for the posts over the next year.
Really, thanks. Thanks for every comment, question, or view. It is always appreciated.
I read a good blog post today by Jeffrey Engel about his tools used for information design.
Certainly, as he describes, to express what one wants one should use the tool best suited to the task. To that end I would only add that many are simply unaware of how expressive Visio can be. To help fill that gap, I recreated a portion of his chart in Visio 2010.
Here is the original:
And my version using Visio 2010:
Some of the Visio magic is not apparent – the 3D box shapes have darker sides, as you can see. However, the colors of the darker sides are *automatically* determined by the color of the front side. So formatting all those 3D boxes was a trivial exercise. Having done this once now, I imagine one could recreate the drawing in less than 10 minutes.
I don’t believe that there is one tool that will meet everyone’s needs when designing things like this, and certainly Visio has its limitations. But it also has great strengths. If you have access to it, try your hand at using it for information design – I think you’ll discover it has a lot to offer.