The 8 hour journey to a single character
I came into work one day not too long ago and was met with an unfortunate piece of information. The test we ran the previous night reported a 4% drop in accuracy (we found 4% fewer faces). I frowned. I had clearly screwed something up.
You see, the day before, I found that our code was doing some really pathological and unnecessary color conversions on images. We were using our video i/o library (a very thin wrapper on ffmpeg) to read in movies and allowing it to convert the frames into RGB. As soon as our code got ahold of these RGB frames, we immediately converted them to grayscale. The pathology of the situation is that the vast majority of all video formats are already in the YUV color space (or YCbCr color space. I will use the terms interchangeably, but you should know they aren’t quite the same thing). For those among you that aren’t image processing nerds, the Y-channel of YUV is functionally equivalent to a gray channel. In other words, videos are almost always encoded with a gray channel and 2 color (chrominance) channels. That means converting these images to RGB just to convert them back to gray is beyond wasteful.
We had actually known about this problem for a while, but it had never been a terribly pressing issue. That changed recently when we were doing some really high speed processing and we noticed our color conversions were a non-trivial portion of the time. I dug into our video i/o library, removed the conversion to RGB, and piped the Y channel of the YUV frames into our software. Everything appeared to be working perfectly. Output looks good. No memory leaks. Commit. Go home. Watch House.
The next day…
So I walk in, bright and early, and hear the disappointing news. We lost 4% accuracy. Did my colorspace change cause this? I hope not. Within half an hour, I realized that my colorspace conversion change clearly was not correct. The frames that I had tested on the previous day looked identical, but they weren’t. I expected some minor round-off error as a possibility, but the changes were more than I anticipated.
The error became extremely obvious when I ran the new decoding library on the opening of Star Trek. Conveniently, the opening of Star Trek is pitch black. My “new” library was not producing black! It was near-black, but very clearly NOT black. Uh oh. What’s going on?
Let’s look at the numbers
I dug into ffmpeg and very quickly reproduced the problem. The very first pixel of the very first frame of this episode of Star Trek had a Y value of 16. Not zero. When ffmpeg converted this value to RGB, the result came out to be (0,0,0). Black. Wait a minute, since when is Y=16 equal to black? What is going on?
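To make the mismatch concrete, here is a minimal sketch of the two conversions side by side. It assumes the textbook BT.601 "video range" coefficients and the JFIF "full range" coefficients in floating point; ffmpeg and the JPEG library use their own integer arithmetic, so treat the function names and the exact rounding here as illustrative, not as what either library actually runs.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;

/* Clamp to 0..255 and round to the nearest integer. */
static uint8_t clamp_u8(double v) {
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* Video-range YCbCr (assumed BT.601): Y spans 16..235, Cb/Cr span 16..240. */
Rgb ycbcr_video_to_rgb(uint8_t y, uint8_t cb, uint8_t cr) {
    double yy = 1.164 * (y - 16);   /* stretch 16..235 onto 0..255 */
    double u = cb - 128.0;
    double v = cr - 128.0;
    Rgb out = {
        clamp_u8(yy + 1.596 * v),
        clamp_u8(yy - 0.392 * u - 0.813 * v),
        clamp_u8(yy + 2.017 * u)
    };
    return out;
}

/* Full-range YCbCr (assumed JFIF): every channel spans 0..255. */
Rgb ycbcr_full_to_rgb(uint8_t y, uint8_t cb, uint8_t cr) {
    double u = cb - 128.0;
    double v = cr - 128.0;
    Rgb out = {
        clamp_u8(y + 1.402 * v),
        clamp_u8(y - 0.344136 * u - 0.714136 * v),
        clamp_u8(y + 1.772 * u)
    };
    return out;
}
```

With neutral chroma (Cb = Cr = 128), Y=16 comes out as (0,0,0) under the video-range formula and as (16,16,16) under the full-range one. Same byte, two different grays, depending on which definition you assume.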
I looked at the only other place in our codebase where we deal with this type of information: JPEG decoding. JPEGs also use YUV formatting. What would this particular frame look like in a JPEG’s YUV? This is a quick test. Is it also 16? Nope, it’s zero.
For an RGB value of (0,0,0), ffmpeg is telling me the Y value is 16, and the jpeg library is telling me the value is 0. Either there is an egregious bug in one of the two most well-tested libraries on the planet, or I clearly don’t understand wtf is going on. I’m going to assume the latter. But that will have to wait until after lunch.
Research time
When I got back from lunch, it was time to hit up the internet. If you search the internet for formulas to convert yuv to rgb you’ll get all sorts of conflicting information. You’ll even get very different formulas. If you read the YUV or YCbCr Wikipedia pages, you can easily miss the most important information (hint: it’s not the equations). After spending a tremendous amount of time reading (and being confused about the different formulas), I made the critical discovery. There are different definitions of YUV.
I then had to dig into the details to find out just how deep this particular rabbit hole goes. In the end, it wasn’t terribly complicated (but it was difficult to find good information). In essence, though, different standards define different dynamic ranges for the YUV color space when digitized into 8 bits per sample. Movie files (e.g., MPEG) will often use 16-235 for the Y channel (black->white), while images (JPEG) will use 0-255. A movie file’s white (235) != a jpeg file’s white (255). To make matters worse, the Cr and Cb (i.e., U and V) channels use yet another dynamic range for MPEG files (16-240), though jpeg is always [0-255]. Oh my.
Note: if you are here because you are having similar yuv/rgb problems and google led you here, I strongly suggest you read every single word of these three links:
The Fix, part 1
If all I need to do is rescale to a different dynamic range, that is not a difficult problem to solve. It’s a fair bit tricky (watch those unsigned overflows!) but it’s nothing that can’t be accomplished through the power of C. I spent an hour or so writing a function to convert the three channels to the expanded dynamic range (remembering that the Y channel uses a different range than the U and V channels). I knew I’d lose some information, but what choice did I have?
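For the curious, a range expansion along these lines might look like the sketch below. This is not the function I actually wrote; the names are made up for illustration, and it assumes the usual 8-bit video ranges of 16-235 for Y and 16-240 for the chroma channels. Doing the math in `int` before narrowing back to `uint8_t` is what keeps the unsigned overflows away.

```c
#include <stddef.h>
#include <stdint.h>

/* Map a sample from [lo, hi] onto [0, 255], clamping out-of-range input.
 * All arithmetic is done in int so nothing wraps around in uint8_t. */
static uint8_t expand_sample(uint8_t s, int lo, int hi) {
    int v = s;
    if (v < lo) v = lo;   /* below nominal black: clamp */
    if (v > hi) v = hi;   /* above nominal white: clamp */
    /* Adding half the divisor makes the integer division round to nearest. */
    return (uint8_t)(((v - lo) * 255 + (hi - lo) / 2) / (hi - lo));
}

/* Assumed video ranges: luma 16..235, chroma 16..240. */
uint8_t expand_luma(uint8_t y)   { return expand_sample(y, 16, 235); }
uint8_t expand_chroma(uint8_t c) { return expand_sample(c, 16, 240); }

/* Expand a whole luma plane of n samples in place. */
void expand_luma_plane(uint8_t *plane, size_t n) {
    for (size_t i = 0; i < n; i++) {
        plane[i] = expand_luma(plane[i]);
    }
}
```

Black (16) lands on 0, white (235) lands on 255, and neutral chroma (128) stays at 128. The information loss is visible here too: only 220 distinct luma codes go in, so at most 220 of the 256 output values can ever come out.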
Once I had finished, I ran all my previous tests and found the output to be far, far better than the one I was using from the previous day. I also tested my new conversion routine on images that failed from the overnight test and what do you know, they were now finding faces. Mission accomplished!
Not so fast my friend
It was about this time that I felt the need to vent. Seriously, movie and jpeg people, why are you doing this to me? Why are there two (note: actually more than two) different dynamic ranges for 8-bit YUV pixels? Why oh why? (more notes: if you want to learn why, it’s actually a fairly fun and interesting story… taketh thee to wikipedia).
In need of some more complaining, I decided to go onto IRC and complain to the only video developer I know (he works on x264, the open source h264 encoder). I asked him why “they all” go around screwing with people like me with such nonsense. He laughed and went on to explain there are actually more than two different formats and commiserated with me for a moment. And then he said something important. He said only “--fullrange”. Wait. What is --fullrange? Is that an x264 parameter? Yes, yes it is. What does --fullrange do? It uses the full range of YUV. Ah! x264 devs are geniuses. Why would they leave this silly conversion to us?
Oh wait. Does that mean… does ffmpeg… do it too? It has to, right? Let’s check the docs, shall we? There sure are a lot of formats on this list. I wonder if any of them are “full-range” YUV.
PIX_FMT_YUVJ420P Planar YUV 4:2:0, 12bpp, full scale (jpeg).
PIX_FMT_YUVJ422P Planar YUV 4:2:2, 16bpp, full scale (jpeg).
PIX_FMT_YUVJ444P Planar YUV 4:4:4, 24bpp, full scale (jpeg).
Does that “jpeg” mean these are “jpeg-style” full-range YUV outputs? I should try this. Within minutes I realized that yes, these formats outputted YUV channels that used the full dynamic range 0-255. Excellent. I reverted all my ugly changes with my own customized range expansion code and committed this final fix.
- PIX_FMT_YUV420P,
+ PIX_FMT_YUVJ420P,
One character. One friggin’ “J”. 8 hours. I hope no one is keeping track of “lines of code per hour”.
November 23rd, 2009 at 9:06 am
i work in an APL derivative. my best days are the ones with negative LOCs.
November 23rd, 2009 at 10:20 am
That is a great story. Early in my career, I once spent about 36 hours chasing a bug that turned out to be a single misplaced closing brace, but this is better (for some value of “better”).
November 23rd, 2009 at 11:49 am
Great story. Reminds me of a time we brought down the site when we pushed out a / that was supposed to be *
November 23rd, 2009 at 12:20 pm
In college I once spent over 2 hours on:
if(something false);
{
}
CONVINCED that the compiler had a bug. Note: the compiler never has a “bug”,
although I have triggered asserts in javac (with code that violated a tricky part of the spec)
and I have code that can crash msvc’s compiler =).
November 23rd, 2009 at 1:54 pm
wow, you are an idiot. i feel bad for you. you work on image processing for a living?
November 23rd, 2009 at 1:56 pm
heh, everyone knows that the J comes between the V and the 4.
November 23rd, 2009 at 1:57 pm
A poignant illustration of the chaotic nature of computer programming.
November 23rd, 2009 at 4:02 pm
I recently learned that removing the list of new products from the oscommerce start page and replacing it with a list of all categories requires replacing '' with '0' in one place in includes/application_top.php
November 23rd, 2009 at 5:40 pm
Been there, done that. For some reason IE wasn’t rendering the page I wanted right, and the IE hacks I had used before were failing. Being on a deadline, I did something bad (an alternate ugly stylesheet).
Eventually I worked out that IE was rendering this page (but no other) in quirks mode. Doctype is there and correct, the html is a well-formed XML tree, what could be wrong?
Turns out the page is saved as UTF +BOM and the others as UTF -BOM, and for some reason the web-server is emitting the BOM at the start of the file. IE is seeing something before the doctype and going into quirks mode. When I look at the source it seems clear (due to the BOM being invisible).
Remove the ugly alternate stylesheet, fix one line of code, push changes into source control.
November 23rd, 2009 at 6:27 pm
If you come from a straight computing background, you’ll mostly be exposed to full-scale colour channels, and probably think everything uses 24-bit RGB with 0-255. But in the professional film/video industry, there’s a ton of different standards, formats and colour spaces, and the rescaled channels are very common. Also, 10-bit per channel video is normal these days. There are also very good reasons for doing things this way – the 16-235 scale wasn’t put there to sneak a bug into your application. It’s to allow headroom and toeroom during post-processing. It is worth understanding the history and rationale behind something before dismissing it as “nonsense” – there’s usually a pretty good reason.
The moral of this story: always keep your colour space in mind. Even if you’re looking at RGB, is it Rec.601 or 709? Gamma corrected or linear? And you *cannot* use the terms YUV and YCbCr interchangeably and still be accurate – they are NOT the same. Please be precise with your terms, or you merely propagate the confusion.
November 23rd, 2009 at 6:37 pm
I’ve had so many experiences like this. Rarely spanning multiple days though.
Patience is the most important programming skill.
November 23rd, 2009 at 6:53 pm
Single character bugs are the best ones. Without going into much detail, some of the single characters that have caused me the most hours of work are:
!
=
*
November 23rd, 2009 at 6:54 pm
Ugh; last post should have included:
<
>
November 23rd, 2009 at 7:38 pm
I once chased a strange floating point error in a calculation routine. When I traced it in CPU-view, I found out that the compiler assumed the ‘direction flag’ was always 0, which it wasn’t in this case. An __emit__(“CLD”) in the right place fixed the error.
November 23rd, 2009 at 10:03 pm
Those one character bugs are the best ones. Honestly, I love the feeling of tracking what feels like a huge problem down to one little char. Once you get it fixed, it’s FIXED and it’s satisfying.
November 24th, 2009 at 3:35 am
I once managed to insert a non-printing character into a perl script by hitting Shift+Space (invoking SCIM) instead of Space. Took a long time to find the problem.
November 24th, 2009 at 9:03 am
Thanks for sharing! Felt like an adventure reading this, although I had a vague recollection of where the tale was leading since I dipped my toe into the video conversion world in the mid 1990s just long enough to experience some of your frustration with the numerous formats. Things are hopefully/probably better now, but then it felt like a wasteland of incompatible, competing formats.
Oh, and speaking of all-day simple bugs. In the early 90s, I once spent an entire day on a single line of code in a 3D rendering engine. The code seemed to change its output depending on what called it. Turns out it was worse than that — output depended on whether it was before or after a comment! Senior developers thought I was nuts so I reduced the test case to a 5-line program. So this:
void function()
{
i=i+i;//comment
}
produced different output from this
void function()
{
//comment
i=i+i;
}
Yeah, no kidding, that was weird. Turns out it was a bug in the compiler (we were using Microsoft’s shiny new C++ compiler, still in Beta) causing i=i+i to produce unpredictable results. Filed a bug report on the phone (this is pre-WWW, Windows 2.1 days). Good times.
November 24th, 2009 at 9:54 am
It seems to me that you have now regressed to performing a (albeit simpler) colorspace transform inside ffmpeg, which I thought it was your original goal to avoid . . . admittedly it should be done by nice fast code, but it still involves an unnecessary full frame duplication, which could be a pain if you care about the cache footprint.
November 24th, 2009 at 10:18 am
It happened in my life too. really nice post. thanks 4 sharing it.
November 24th, 2009 at 11:13 am
Interesting story, I once spent about a day and a half trying to make a custom jabber client work, only to find out that one character that was supposed to be capitalized was lower case, it took so long as I kept looking at what appeared to be identical logs – but one worked and one didn’t. So I definitely feel your pain.
November 24th, 2009 at 12:53 pm
I once spent a day looking for what turned out to be the difference between “K” and “k”.
I also once helped two friends who had been debugging a web server configuration for several hours. They called me over, I looked at a few of the config files (this was httpd 1.3 or 1.4) and then said — uh, could you put a ENTER right *there*? They went “huh, why???”, and I said, “just do it please…” and everything worked.
It was the very last line of the file, and their parser required a newline (or maybe CR, I forget) to be at the end of the line in order to recognize the line should be processed. Doh!
February 4th, 2010 at 2:31 am
The journey to a single character is very nice. For example, in (j = 1; j<20; j++) the j is a single character, but it is very important for printing 20 numbers.