Wednesday, June 15, 2011

Putting an end to the "Log Wars"

A long time ago, in a laboratory far far away, there was a lowly FACScan able to display data on a 4-log scale.  Fast-forward to today, and you'll find some instruments with as many as 7 logs of scale.  That's a huge improvement, right?  Well, maybe not.  The origin of the 4-log scale probably had more to do with the Analog-to-Digital Converters (ADCs) being used than with the technological needs of the science being done in the 80s.  With the advancements in ADCs in other markets, flow cytometry manufacturers could include converters with greater bit density and still provide a relatively affordable product.  The standard for many years was the 10-bit ADC, which yields 1024 bins of resolution across the scale.  Spreading these 1024 bins across a 4-log scale appears to give enough resolution while covering a reasonable range.

After many years using these solid electronic components, BD completely redesigned the electronic system on its cell sorter (the BD FACSVantage) to give us the FACSDiVa (or Digital Vantage) architecture.  Instead of using traditional ADCs and log amplifiers, BD switched things up by using "high" speed Digital Signal Processors (DSPs) to directly digitize the analog pulse and then do the log conversion using look-up tables.  The DSPs converted the linear data at a bin density of 14 bits (16,384 bins), and when the data is log converted, it is upscaled to 18 bits (262,144 bins).  Now, with 18-bit data, they were able to display it on a 5-log scale.  The reason?  Well, if I were forced to guess, I'd say it was a marketing decision to differentiate BD's new line of cytometers from its old line as well as its competitors'.  With this new 5-log data came the "picket fencing" phenomenon, which demonstrated that the 18-bit data (which was really 14-bit data) did not have enough bin resolution to display data properly in the 1st decade.  The solution?  Simple: hide the 1st decade and display decades 2 through 5 (right back at a 4-log scale).  Because the BD instruments were so popular, other companies jumped on the bandwagon and thought, well, if BD is doing 5 logs then we should do 6 logs, or maybe 7.  And that's how we arrived where we are today, and now I'd like to show you why this is a bad thing.
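The digitize-then-look-up scheme is easy to sketch in a few lines.  This is a toy model of the general technique only, assuming a plain log10 table (the 14-bit and 18-bit constants match the numbers above; nothing here is BD's actual firmware):

```python
import math

# Illustrative parameters from the post: 14-bit linear data
# upscaled to an 18-bit log-converted scale.
LINEAR_BITS, LOG_BITS = 14, 18

def build_log_lut():
    """Map each 14-bit linear channel to a position on an
    18-bit log scale via a lookup table (toy sketch)."""
    n_lin, n_log = 2 ** LINEAR_BITS, 2 ** LOG_BITS
    max_dec = math.log10(n_lin - 1)   # decades spanned by the linear data
    lut = [0] * n_lin
    for v in range(1, n_lin):
        lut[v] = round(math.log10(v) / max_dec * (n_log - 1))
    return lut

lut = build_log_lut()
# The upscaling creates no new information: the first few linear
# channels land on widely separated 18-bit codes, and the codes in
# between can never occur -- that gap is the "picket fence".
```

In this sketch, the 18-bit codes for linear channels 1 and 2 end up tens of thousands of codes apart, which is exactly the empty-channel pattern that forced the first decade to be hidden.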

Let me start with my conclusion first, and then show you how I arrived at it.  The figure to the right shows the minimum analog-to-digital conversion bit density for a given range of log scale.  As you can see, if we wanted to display our data on a 5-log scale, we should have at least a 20-bit ADC.  (Side note: Bit(eff) means effective bit density, which takes into account that if you put a 20-bit ADC on your instrument, it probably doesn't actually perform at a full 20 bits.  This is because there is some noise associated with the ADC, which limits its performance.)
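The figure's rule can be reproduced with a one-line calculation.  This is my sketch of the arithmetic, assuming the ~100-bins-in-the-first-decade target I settle on later in the post (the function name and default are mine):

```python
import math

def min_adc_bits(decades, first_decade_bins=100):
    """Smallest ADC bit depth that leaves at least `first_decade_bins`
    bins in the 1st decade of a `decades`-log scale.  The ~100-bin
    target is this post's rule of thumb, not an industry standard."""
    # The 1st decade (values 1-10) covers 10/10**decades of full scale,
    # so scale the target count up to a full-scale bin count first.
    total_bins_needed = first_decade_bins * 10 ** decades / 10
    return math.ceil(math.log2(total_bins_needed))
```

With this, `min_adc_bits(5)` returns 20 and `min_adc_bits(6)` returns 24, in line with the figure and the 24-bit conclusion below.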

So, how did I arrive at this conclusion?  Well, first let me demonstrate that bit density is important with an example.  I created a mock data set of 3 Gaussian distributions (n=1000 data points for each) where the means and SDs of the distributions were altered such that the populations overlapped significantly.  I then plotted these distributions on 4 histograms with different bin resolutions ranging from 3-bit to 8-bit.  It's important to remember that this is the exact same data set, merely binned differently according to the available resolution.  As you can see, the 3 populations are not at all discernible at the 3-bit range, and it's not until we get to the 6-bit histogram that you can start to see the 3 different populations.  Using this information, we can appreciate the importance of having sufficient bin density to resolve distributions from one another.
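You can recreate this experiment yourself.  Here is a minimal sketch of the setup, with illustrative means and SDs of my own choosing (not necessarily the ones used for the figure):

```python
import random

random.seed(0)  # reproducible mock data

def mock_populations(n=1000):
    """Three overlapping Gaussian populations on a 0-255 linear scale.
    Means and SDs are illustrative, not the post's exact values."""
    means, sd = (70, 120, 170), 20
    data = [random.gauss(m, sd) for m in means for _ in range(n)]
    return [min(max(x, 0.0), 255.0) for x in data]

def bin_counts(data, bits):
    """Re-bin the *same* data at 2**bits bins across the 0-255 scale."""
    n_bins = 2 ** bits
    counts = [0] * n_bins
    for x in data:
        counts[min(int(x * n_bins / 256), n_bins - 1)] += 1
    return counts

data = mock_populations()
coarse, fine = bin_counts(data, 3), bin_counts(data, 8)
```

Plot `coarse` and `fine` side by side and the effect is obvious: at 8 bins the three peaks collapse into one lump, while at 256 bins they separate.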

As an example of a system that might not have enough bin density, consider the following.  Here we have a 20-bit ADC yielding over 1 million bins of resolution to spread across a 6-log scale.  This may sound sufficient, but when we break it down per log, we see that in the first decade, where we have scale values of 1-10, we would only have 11 bins of resolution, which would certainly lead to picket fencing and poor resolution of populations in that range.  The Effective bins column shows an example where the noise of the ADC is such that our true bin resolution would be much less than the theoretical 20-bit.
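The per-decade breakdown is straightforward to compute.  Here is my sketch of it; note that the exact first-decade count shifts by one or so depending on the rounding convention (mine gives 10 where the table above quotes 11):

```python
def bins_per_decade(adc_bits, decades):
    """Roughly how many of the 2**adc_bits linear ADC bins land in
    each decade of a `decades`-log scale.  Rounding convention is
    mine, so counts can differ by ~1 from the post's table."""
    total, scale_max = 2 ** adc_bits, 10 ** decades
    edges = [0] + [round(total * 10 ** d / scale_max)
                   for d in range(1, decades + 1)]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]
```

For a 20-bit ADC on a 6-log scale, `bins_per_decade(20, 6)` puts only about 10 bins in the first decade, while `bins_per_decade(24, 6)` puts about 168 there, consistent with the hundreds-of-bins target discussed below.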

Going through the process and crunching numbers for different scenarios, I conclude that ideally we would like to have on the order of 100s of bins of resolution in the 1st decade.  So, in order to achieve that level on a 6-log scale, we'd actually need to have a 24-bit ADC.  Now, the breakdown would be like what's shown below.

Take-home message:  First of all, is a 6-log scale really necessary?  For you the answer may be yes, but for most, probably not.  The second question to ask your friendly sales representative is what sort of analog-to-digital conversion is done, and what the bit resolution of the converter is.  It means nothing to have a 7-log scale displaying data from a 10-bit ADC; no matter how good the optics are, you'll never be able to resolve dim populations from unstained cells.  What really matters is pairing a good optical system with high-speed, high-density electronics that can display all the fine detail of your distributions.  Find an instrument like that, and you have a winner.


7 comments:

  1. is this a blog post or a frig'n article. sheesh. :)

    ReplyDelete
  2. Twitter = short quips and links
    Facebook = ideas and notifications
    Blog post = more in-depth analysis

    ReplyDelete
  3. great post! So... what instrument (from those commercially available) approaches that ideal machine?

    ReplyDelete
  4. I believe there's an element missing from your analysis. The histogram your software displays has a resolution, and that factors into whether you see picket fencing. (You use the word "bin" to refer to one possible value from the A/D converter. That's correct, but I'm going to use "channel" to distinguish from the bins in the software's histogram.)

    A cytometer that returns a 20-bit value gives you values that range from 0 to 1,048,575. To display this full range (without scaling) requires scales that go from 0 to 6.02 decades.

    To display this on a log scale, the software scales the values using log() and then assigns them to a histogram bin. The number of bins per histogram is configurable in most software.

    Picket fencing occurs when there are histogram bins which cannot be chosen for any input value to the log() function. Because the log() transform spreads out the channels at the low end, and compresses them at the high end, the closer you get to zero the fewer channels there are per histogram bin. If you use a lower resolution histogram, you’ll have more channels per bin, and the picket fencing artifact will be less frequent.

    For example, for our 20-bit cytometer, we can set the lower limit to the following values for the corresponding number of bins and completely avoid picket fencing:
    Bins   Low Dec.   Disp. Dec.
    1024   10^2       4.02
     512   10^1.7     4.32
     256   10^1.25    4.77
     128   10^0.8     5.22
      64   10^0.44    5.58

    So can you display 20 bit data on a 5 decade display without picket-fencing? Yes, but only if you use fewer than 256 bins in your histogram.

    Are you going to encounter picket fencing on a six decade display? Yes, unless you have a lot of bits per value, or you use relatively low resolution histograms.

    How many bits are enough? How bright are your signals? How many bins do you usually use in your histograms? Then you’ll have a ballpark idea of whether you’ll encounter picket fencing.

    ReplyDelete
    Replies
    1. Awesome, thanks Ernie for the additional info.

      Delete
  5. If you ever get curious enough, let me know. I have an Excel spreadsheet that will tell you which decades might exhibit picket fencing given a minimum and maximum axis ranges, for different histogram resolutions.

    ReplyDelete
    Replies
    1. Hello

      Your information is very useful. I am working on the standardization of a 24-bit flow cytometer, using an 18-bit cytometer as a reference, and I am interested in the Excel spreadsheet you mentioned. Could you please share it?

      Thanks,

      Delete
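Ernie's channel-vs-histogram-bin criterion from comment 4 can be sketched as a quick check.  This assumes integer ADC channels and evenly log-spaced histogram bins; the function name and the exact emptiness test are mine:

```python
import math

def has_picket_fencing(n_bins, low, high):
    """True if any of `n_bins` log-spaced histogram bins spanning
    [low, high] contains no integer ADC channel, i.e. a bin that can
    never be hit -- the picket-fence gap Ernie describes."""
    lo_dec, hi_dec = math.log10(low), math.log10(high)
    step = (hi_dec - lo_dec) / n_bins
    for k in range(n_bins):
        a = 10 ** (lo_dec + k * step)
        b = 10 ** (lo_dec + (k + 1) * step)
        if math.ceil(a) >= b:   # no integer channel falls in [a, b)
            return True
    return False
```

For a 20-bit cytometer, `has_picket_fencing(1024, 1, 2 ** 20)` comes out True (full 6-decade display, 1024 bins), while `has_picket_fencing(64, 100, 2 ** 20)` comes out False, matching the spirit of the table in comment 4: fewer, wider histogram bins and a raised lower limit both suppress the artifact.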