Anti-sophist
Graduate Poster
- Joined
- Sep 15, 2006
- Messages
- 1,542
I gathered up all the publicly available flight-data-recorder information and looked at it closely. My initial intent was to properly analyze the data and debunk the variety of dopey conspiracy theories that abounded. After reading all of the NTSB reports and looking carefully at the data provided about the hardware and the CSV file, I realized that virtually all conspiracy-theorist attempts at using this data for sub-second-accurate reconstruction are completely and utterly baseless. In the words of Pauli, paraphrased, they aren't even wrong.
What follows is a copy/paste of the bulk of a longer doc file I've been writing. It details, specifically, what flight data recorder data looks like, how it is recorded, how it is decoded, and what the CSV file flying around actually is (and how it was made). I'd really appreciate a proofread and some constructive criticism on any gaps. The full document contains some examples to illustrate the concepts, but the tables don't translate very well, so I've removed several examples and the paragraphs that go with them.
Contained in this document is a pretty thorough description of all the sources of error that pop up when using the CSV file as a "raw" FDR data output, and I explain how the real FDR data has few of these problems. I don't actually debunk any specific claims (i.e., JDX's), entirely because almost all of the flaws in those analyses are simple and trivial to point out given a thorough understanding of what the CSV file is.
The meat of the paper is section 3. Sections 1 and 2 are scientific background and descriptions of the various technical aspects.
----
About Me:
MS Electrical Engineering, worked with the USAF (as a civilian) on F15s doing data recording and telemetry. I've designed, built, tested, installed, and maintained flight data acquisition systems, of which the FDR is a very low-bit-rate version. It also has the unique characteristic, among data recorders, of being crash survivable.
-------------
I. Recorded Flight Data Format
The recorded flight data is serial binary data. That means the guy who sat down with Flight 77's data recorder fed its contents into a computer over what is, in effect, a single wire, and across that wire came a series of bits: 1,1,1,0,0,1,0,1,1,0,0,1.
When you consider the problem of sitting at a computer, watching a serial stream of 1s and 0s come in, and trying to make sense of it, you begin to realize the engineering difficulties in making this process work well. The first (and most familiar) step is to break the signal up into bytes (8 bits), or other units of length (the FDR on Flight 77 uses 12-bit words instead of bytes). Throughout the document, I will refer to "words", which simply means a predefined number of bits. For Flight 77, specifically, it means 12 bits; however, the logic below applies to any number of bits per word.
The next major abstraction is the frame. A frame is a specific group of words. On Flight 77, the frame length was 256 words. In order to detect errors, each frame contains specific "synch" words that are used to keep the data-processing software "in synch": every 256 words, the recorder inserts a known "synch" word. This synch word, literally, is used to keep the data-processing in synch and to help catch errors.
All frames are exactly the same length, with the known synch words in exactly the same places. For this reason, when you are receiving data from a data recorder, you know there are supposed to be, say, 2000 bits between synch words, and so if the current frame you've received has only 1999 bits between synch words, you know that a bit has been dropped (this happens more often than you'd think). The question becomes: "Ok, we dropped a bit… but from where?" Chances are high only one of your words is corrupted (11 bits instead of 12), but it's impossible to know which one, so you are forced to throw out the entire frame. (Please keep that thought in mind when conspiracy theorists talk about "partial frames".)
Oftentimes, frames of serial data are structured even further into "major frames" and "minor frames". A major frame is simply a collection of minor frames, and this is done almost always for convenience. Flight 77 has major frames that are 4 seconds long, each broken down into four 1-second minor frames, each consisting of 256 12-bit words.
All frames have time stamps. Since each frame represents an exact amount of time, the recorded time of any single word can be calculated from its position in the frame and the time-stamp of the frame. If a word is exactly halfway into a frame, its time of recording was exactly halfway between this frame's timestamp and the next one.
There are some important issues to note about recorded data (see the sketch after this list):
1)The amount of data flowing is always constant. There are exactly N bits in T time, never more, never less (if you were designing a system, you’d put filler words in to make sure of this).
2)If there are too many, or too few bits, between a pair of synch words, it’s virtually impossible to tell which data is corrupt, and almost always all of this data is thrown out.
3)Data, relative to the frame, is always recorded at the same time. If your frame period is 1 second and you set it to record the altimeter at 0.3 seconds, it’s going to be in the data stream at 0.3, 1.3, 2.3, 3.3, 4.3, etc. Each piece of data gets his one shining moment specially reserved for him, so he had better be ready to go at that moment.
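To make the decoding side concrete, here is a minimal sketch in Python of the logic described above: find the synch words, reject any frame whose synch-to-synch distance is wrong (a dropped bit), and assign each surviving word a time from its slot in the frame. The synch value (0o1107) and the helper names are made up for illustration; the real 757 frame geometry and synch words live in the (unreleased) Boeing frame description.

```python
# Sketch only: hypothetical synch word; real decoders are considerably more careful.
WORD_BITS = 12
WORDS_PER_FRAME = 256                 # one 1-second minor frame
SYNC_WORD = 0o1107                    # made-up 12-bit synch pattern

def word_at(bits, offset):
    """Read one 12-bit word (MSB first) from a list of 0/1 ints."""
    return int("".join(map(str, bits[offset:offset + WORD_BITS])), 2)

def decode_frames(bits, first_frame_time):
    """Split a bit stream into frames, throwing out any frame whose
    synch-to-synch distance is wrong (e.g., a dropped bit)."""
    expected = WORDS_PER_FRAME * WORD_BITS                     # bits between synch words
    sync_positions = [i for i in range(len(bits) - WORD_BITS + 1)
                      if word_at(bits, i) == SYNC_WORD]
    frames, t = [], first_frame_time
    for start, nxt in zip(sync_positions, sync_positions[1:]):
        if nxt - start == expected:                            # otherwise: corrupt, discard
            words = [word_at(bits, start + WORD_BITS * slot)
                     for slot in range(WORDS_PER_FRAME)]
            # a word's recording time follows directly from its slot in the frame
            frames.append([(t + slot / WORDS_PER_FRAME, w) for slot, w in enumerate(words)])
        t += 1.0                                               # next minor frame is 1 s later
    return frames
```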
[FONT="]
[/FONT] II. On-Aircraft Recording Systems
A recording system, then, is a system that samples data from around an aircraft, compiles it somehow into a fixed bit-rate serial data stream, and sends this data to a recorder. First, let’s discuss the type of data recorded on an aircraft. There are two main sources of data, as far as the recording system is concerned:
1)Data the recorder can access at any time (almost always analog sensors)
2)Data that the recorder is told about at certain, unpredictable times (almost always digital information from a computer).
An accelerometer is an analog sensor. The recorder can read the accelerometer at any time. So if the accelerometer data is programmed to be recorded at 19.723 seconds, the recorder can read the accelerometer at 19.721 seconds, get the answer, and store it. This means that for data of type 1, the time the data is recorded is virtually identical to the time it is measured. This is the key distinction between the two types of data.
The second type of data comes from a digital source like, for example, an Air Data Computer (ADC). The ADC might compute the airspeed 5 times per second, along with helping the pilot fly the plane. It's much smarter (and safer) to let the ADC say to the recorder "Here is the computed airspeed" when it isn't busy. The opposite approach would be to let the recording system interrupt the ADC and say "Give me the airspeed". The ADC might be doing more important things during that time, and giving out data to the recorder might not be the highest priority. In almost all situations, the first method is preferred: the device, when it's ready, sends the data to the recording system.
This fundamental design decision has serious implications. The airspeed data might not be programmed to go out in the serial data stream until 0.75s (remember, this data is stored at a specific time), but the ADC has informed the recorder at 0.3s of his airspeed. The recorder unit must be able to receive this information from the ADC or other digital sources, and store it, until it is time to record it. Recording systems all employ some type of digital buffering, so that they can receive and hold information until that particular piece of data gets his turn to be recorded.
Be mindful that this introduces an error. If the data was measured at 0.3s and recorded at 0.75s, our poor software engineer who is decoding it later will think it was measured at 0.75s. This problem is generally solved by reserving space in the data-stream for time-stamps of the data. In other words, word 3 might be for the computed airspeed, and word 4 might be for the time-stamp of when the computed airspeed was measured. In this way, the airspeed value might be recorded at time 1.7, but the timestamp will tell us it was measured at 1.3. It's very important to understand that when this type of data was recorded does not indicate when it was measured. You need this timestamp information to know when it was measured. Flight 77's raw FDR data probably has these timestamps, but the CSV file does not.
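A toy illustration of this buffering effect (a sketch with invented numbers, not the actual 757 data frame): the DAU holds the last value a digital source pushed to it, and that value goes out whenever its reserved slot in the frame comes up, so the recorded time and the measured time can differ by a large fraction of a second unless a separate timestamp word is also recorded.

```python
# Sketch: how a buffered digital parameter picks up a recorded-time vs. measured-time offset.
# The slot times and update times below are invented for illustration.

class DAU:
    """Holds the most recent value (and measurement time) pushed by each digital source."""
    def __init__(self):
        self.buffer = {}

    def push(self, name, value, measured_at):
        self.buffer[name] = (value, measured_at)

    def read(self, name):
        return self.buffer[name]

dau = DAU()

# 0.30 s: the Air Data Computer, when it gets around to it, sends its computed airspeed.
dau.push("computed_airspeed", 305.5, measured_at=0.30)

# 0.75 s: the controller reaches the frame slot reserved for computed airspeed and records
# whatever the DAU is holding at that moment.
slot_time = 0.75
value, measured_at = dau.read("computed_airspeed")

print(f"recorded at {slot_time:.2f} s, but measured at {measured_at:.2f} s "
      f"(offset {slot_time - measured_at:.2f} s)")
# Without a companion timestamp word in the frame, a decoder only ever sees the 0.75 s slot time.
```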
Hardware and Terminology
All flight data recording is split into 3 distinct, logical, components. Modern recorders contain all 3 modules in a single box, but if you were to crack it open, and look at the design, you’d see three very distinct components:
1)DAU: Data Acquisition Unit: The DAU is responsible for buffering all digital data (and timestamps of when it was measured), and having all analog data sampled and ready to go. Basically, you can think of the DAU as the RAM or memory of the recording system.
2)Controller: The controller is responsible for executing the program (the frame). Basically he follows the tabular chart. If we are at word 1, we tell the DAU “Send Major Synch to the Recorder”. Wait for the word to finish sending, and then tell the DAU “Send the Time Stamp to the Recorder”, and so on.
3)Recorder: Obviously receives a stream of data and stores it to some medium. The actual recording medium used in the FDR of Flight 77 was “solid-state”. It’s a fairly new recording technology and a large improvement over the older methods (magnetic tapes).
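Tying the three pieces together, here is a minimal sketch of the controller's job: walk a frame map (word slot -> parameter) and send whatever the DAU currently holds to the recorder, one word per slot. The map, the parameter names, and the stand-in helpers below are all invented for illustration; the real map is the Boeing frame description discussed in section III.

```python
# Sketch: the controller walks a (made-up) frame map and clocks each word to the recorder.
FRAME_MAP = {0: "SYNC", 1: "frame_timestamp", 2: "pressure_altitude", 3: "computed_airspeed"}
WORDS_PER_FRAME = 256

def run_frame(read_from_dau, write_to_recorder):
    """read_from_dau(name) -> current 12-bit value; write_to_recorder(slot, value) stores it."""
    for slot in range(WORDS_PER_FRAME):
        name = FRAME_MAP.get(slot, "FILLER")          # unused slots carry filler words
        write_to_recorder(slot, read_from_dau(name))

# Tiny stand-ins so the sketch runs on its own:
fake_dau = {"SYNC": 0o1107, "frame_timestamp": 123, "pressure_altitude": 480,
            "computed_airspeed": 310, "FILLER": 0}
recorded = {}
run_frame(fake_dau.__getitem__, lambda slot, value: recorded.__setitem__(slot, value))
print(recorded[0], recorded[3])    # the synch word, then whatever the DAU held for airspeed
```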
[FONT="]
[/FONT] III. Flight 77s Flight Data Recorder
The vast majority of information about the Flight Data Recorder (FDR) that is found in the public domain is the result of FOIA (Freedom of Information Act) requests to the NTSB. The full versions of all the reports are available here: http://www.ntsb.gov/info/foia_fri.htm
Beyond the FDR report from the NTSB, there are two released attachments:
1)The .fdr file which contains the actual data-dump from the FDR
2)A CSV file which contains processed FDR data
Some technical details about the serial data on the FDR:
This output is a continuous sequence of four-second frames. Each frame consists of four subframes of 256 separate 12-bit words, with the first word containing a unique 12-bit synchronization word identifying it as subframe 1,2,3 or 4.
The FDR Raw Data File
A raw data file is only useful if you also have the frame description, which describes the synch words and the location, inside each frame, of every recorded value. The frame description used is identified in the NTSB Flight 77 FDR report, page 2, footnote 1:
Based on Boeing’s 757-3B data frame. See Attachments IV and V. Boeing Document D226A-101-3, Rev. G, October 27, 1999 (D226A101-3G.pdf); American Airlines database printout (757-3b_1.txt)
To the best of my knowledge, none of these documents exist in the public domain, and were not released with the FOIA request. Without the frame description information, the raw data is almost entirely useless.
The only question is to what extent this file can be reverse engineered, and what useful data can come from it. First, and most importantly, I am not sure whether this data file has been uncompressed. The Flight 77 FDR report mentions (page 3) that specific software is necessary to uncompress the data. If the data in this file is compressed, then there is virtually nothing useful to be gained without first uncompressing it. A brief look at the header of the raw file shows it appears to contain plain text, which would imply it is not compressed data.
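As a quick sanity check on the compression question, one thing anyone can do is look at the byte-value statistics of the .fdr file: well-compressed data looks nearly random (entropy close to 8 bits per byte), while a plain-text header and uncompressed telemetry usually score noticeably lower. This is a rough heuristic, not proof, and the filename below is just whatever you saved the FOIA attachment as.

```python
# Rough heuristic: estimate the Shannon entropy of the .fdr file in bits per byte.
# Compressed or encrypted data tends to sit very close to 8.0; uncompressed binary
# telemetry and plain text usually sit well below that.
import math
from collections import Counter

def entropy_bits_per_byte(path):
    data = open(path, "rb").read()
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

print(entropy_bits_per_byte("flight77.fdr"))   # hypothetical filename
```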
Under the assumption that it is already uncompressed, I will speculate briefly on the potential gain from reverse engineering it. First, it's very likely that someone could, with a minor amount of effort, figure out the synch words and extract the major and minor frames in raw format. In this sense, you could get "frame lock": you'd be able to align all the data between frames. This may be useful in determining the number of frames, or the state of the final few frames. Extracting any information beyond that would be incredibly difficult to pull off successfully.
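For what it's worth, the "frame lock" hunt is mechanical: slide a 12-bit window across the bit stream and look for values that recur at exactly the expected frame period. A sketch of that idea, assuming the stream has already been unpacked into a list of bits and that the 4 x 256-word, 12-bit geometry quoted from the NTSB report holds:

```python
# Sketch: hunt for synch-word candidates by looking for 12-bit values that recur at
# exactly one major-frame period. Constant data words will also pass this test, so the
# output is a candidate list to be narrowed down by hand, not a definitive answer.
from collections import defaultdict

WORD_BITS = 12
MAJOR_FRAME_BITS = 4 * 256 * WORD_BITS        # four 1-second subframes of 256 words

def word_at(bits, offset):
    return int("".join(map(str, bits[offset:offset + WORD_BITS])), 2)

def sync_candidates(bits, min_repeats=20):
    repeats = defaultdict(int)
    for offset in range(len(bits) - MAJOR_FRAME_BITS - WORD_BITS):
        if word_at(bits, offset) == word_at(bits, offset + MAJOR_FRAME_BITS):
            repeats[word_at(bits, offset)] += 1
    return sorted(v for v, n in repeats.items() if n >= min_repeats)
```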
The CSV file
The CSV (Comma Separated Values) file is the main data source used by all Flight 77 "amateur" forensic analyses. First, we will discuss the reason for this file's existence. This can be found in the NTSB Flight 77 FDR report, page 4:
Attachments II-1 to II-17 contain plots of those parameters that were recorded and validated. The timeframe of the plots is from 8:19:00 EDT to 9:39:00 EDT, with the last recorded data occurring at 9:37:44 EDT.
The data plotted in Attachment II are available in tabular format as a comma delimited (*.csv) file in Attachment III.
It is very important to note that the FDR data in the CSV file was processed so that it could be plotted. I will explain the purpose of this processing and the effect it had on the information contained in the CSV file.
The single most important point of this entire document is to be found here: these frames represent just that: frames. The rows within a frame simply index the samples taken during that frame; they do not represent the times those samples were taken. Many conspiracy theorists incorrectly believe that each row of this file represents 1/8 of a second.
The flawed interpretation is quickly disposed of by a few key pieces of evidence. First of all, if you look at the longitudinal acceleration data (in the example tables omitted from this post), you will see that it is sampled 4 times, and then the other 4 rows are blank. Without getting into the technical details, sampling at 0, 1/8, 2/8 and 3/8, and then not sampling again until 8/8, is absolutely silly. In digital signal processing, sampling out-of-phase like this would result in horrible aliasing effects and poorer reconstructed signal quality. It requires the same amount of effort, and the same amount of bandwidth, to sample at equally spaced intervals, and the data is far superior. There is absolutely no way that the data was sampled "out-of-phase" like the incorrect interpretation would imply.
The second major clue is that our serial multiplexed signal is a constant bit-rate signal. This means that the same amount of data flows during the same period of time, at all times. All data points in this file are squished towards the top of each frame. Under the 1/8-second interpretation, this would mean much more data has to travel out from 0 to 1/8 than has to travel from 6/8 to 7/8. This violates the principle of constant bit rate.
The key point is worth reemphasizing, so I will do it again: the proper interpretation of each frame in the CSV file is that N samples were taken during that second. We know nothing about the times of these samples other than the fact that they were taken during the frame, and are equally spaced within it. Pressure altitude could have been sampled at 0.0 or 0.99, and both would show up exactly the same in the CSV file.
This means we can calculate an error range, in time, for each data point, due entirely to not knowing where in the frame that particular data point was recorded. For a data point sampled at 1 Hz, like pressure altitude, that sample could have occurred at any point from 8:19:00 to 8:19:01. This is an error range of 1 second. A similar calculation can be done to show that the maximum error range is equal to the time period between samples. Samples taken at 8 Hz have an error range of 0.125 s, 4 Hz has 0.25 s, and so on.
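The arithmetic is simple enough to spell out (this just restates the rule above; no real FDR values are involved):

```python
# Worst-case intra-frame time uncertainty is one full sample period per data point.
for rate_hz in (1, 2, 4, 8):
    print(f"{rate_hz} Hz sampling -> up to {1 / rate_hz:.3f} s of uncertainty within the frame")
```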
Also note that the timestamps of the major frames have been processed from the original data (the NTSB FDR report mentions this on page 3). There is no way to know the error in these timestamps, nor do we know the precision. It is a mistake to try to correlate these timestamps with the outside world (like the official time of impact).
The Final 1 to 2 Seconds
Given that the data was compressed and synch-framed, it's very likely that any frame that was not complete would be difficult to recover, if recoverable at all. The implication of this is quite simple: the FDR data in the CSV file "runs out" well before the plane actually hits.
This means that 9:37:44 was the last complete frame gathered by the recorder. That puts the likely time of impact in the 9:37:45-46 range, and possibly even into the 9:37:46-47 timeframe. The presence of 9:37:46 in this data suggests that its timestamp may have made it into memory. How is that possible if 9:37:45 is not a complete frame? That's a good question, but a reasonable hypothesis has to do with the storage mechanism used. Solid state recorders, like all recording media, are quite unpredictable if they fail during write operations. The actual area being used to record data can very easily be corrupted if power fails while writing. It's plausible that the crash caused problems in and around this local area of data, corrupting the 9:37:45 data frame (again, changing a single bit in a synch word is enough to cause software to completely choke).
The moral of the story here is that the FDR data runs out up to 2 seconds before the plane actually crashed into the Pentagon.
Pitfalls Using the CSV File to Reconstruct the Flight of Flight 77
Please keep in mind that the CSV file is not raw FDR data, and it was not meant to be used forensically. As such, much information that is present in the raw FDR data is lost. Using the CSV file for this purpose is not what it was intended for; however, that does not make it impossible. Any analysis done using the file must justify its correctness despite the following errors:
0) Absolute Time Error
The times calculated in the file were processed after the fact by the NTSB analysts. This information can be found in the NTSB FDR report, on page 3. We have no idea what the precision or accuracy of the original time stamps is. Any attempt to correlate FDR times to non-FDR (“real world”) times is flawed from the get-go, given this level of uncertainty. It’s probably safe to assume +/-2 seconds of error in the absolute time. This has no effect on relative time (e.g., if it says two frames are 19 seconds apart, chances are good they are exactly 19 seconds apart, to several decimal places).
False Claim: The FDR couldn't have recorded 9:37:46, the official time of the crash is 9:37:45!
1) Instrument Error
First and foremost, if the sensor or instrument is giving the recorder a bad number, the recorder is obviously not going to record the right one. This type of error must be dealt with on an instrument-by-instrument basis. Any reconstruction should justify the precision used for each value obtained. Please keep in mind that all other errors in this document are due to the recording system and the information lost in the processed version of its data; the uncertainty caused by scheduling data into a frame, plus the digital buffering, is in addition to any instrument errors.
False Claim: I already debunked the lagging altimeter nonsense!
(The recording-system errors discussed in this document are in addition to, and independent of, any instrument errors.)
2) Intra-frame Time Error
Since we do not have the frame descriptor, all we know is that N samples are taken during a 1 second period. This means that 1/N of a second is the possible error range for a particular data point. With the frame descriptor, this error would be completely removed if using the raw FDR data.
False Claim: The aircraft’s speed at 09:37:14.00 was 305.5 knots!
3) Digital Buffering Latency
One of the most important purposes of the DAU is to buffer digital outputs from things like the ADC (Air Data Computer). It is a reasonably safe assumption that the Air Data Computer updates the DAU at least once per sample period, and more than likely twice. This means that for a 1 Hz sample recorded into the data stream, the actual measurement time could have been anywhere in the entire previous second. Combined, this means a digital reading in the CSV file, like Computed Airspeed, which comes from the Air Data Computer, has an enormous error range, in the vicinity of 2 seconds, although 1.5 seconds is probably a safe estimate (0.5 s for the buffering latency, and 1 s for the uncertainty of when the sample was actually recorded). More than likely, the raw data stream has the actual measurement times embedded, so this error might be completely removed using the raw FDR data.
False Claim: The worst case scenario for the 9:37:14 frame's airspeed is 9:37:14.00, then!
(Yes, that is the worst case time it could be recorded… not measured).
4) Simultaneity Issues
You cannot assume any two samples occurred at the same time. Any analysis that combines two columns of numbers is risking using numbers that did not happen at the same moment in time, for a calculation that assumed they did.
False Claim: The altimeter data shows you’d need positive acceleration to hit the light poles, the accelerometer is showing negative acceleration! (Did you account for the +/- 2 seconds, potentially, between those two separate data points?)
The Bottom Line on Error
Two variables sampled at 1 Hz will appear on the same line in the CSV file; however, they can have a total, combined, error range of nearly 3 (or even 4) seconds. How is that possible? Let me walk you through it.
In this example we consider two samples, A and B, both sampled at 1Hz, during the time frame between second 1.0 and 2.0:
Sample A:
Measured at 0.5s ADC->DAU (buffered)
Recorded at 1.0s DAU->REC (recorded)
Sample B
Measured at 1.98s ADC->DAU
Recorded at 1.99s DAU->REC (recorded)
Please note that Sample A and Sample B will both appear on the same line in the CSV file, as they are both part of the frame with timestamp 1.0 seconds. In this example, Samples A and B are on the same line of the CSV file but were measured nearly 1.5 seconds apart. The situation could be reversed for A and B, giving a total error between the two of +/- 1.5 seconds, for a full error range of 3 seconds.
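Running the numbers from this example (the sample times and the half-second buffering latency are the invented values above, not anything taken from the real data):

```python
# The worked example above, in numbers: two 1 Hz parameters on the same CSV row.
frame_start = 1.0                      # the row's timestamp

a_measured, a_recorded = 0.5, 1.0      # Sample A: buffered half a second before the frame
b_measured, b_recorded = 1.98, 1.99    # Sample B: measured and recorded late in the frame

spread = b_measured - a_measured       # how far apart the "same row" values really are
print(f"Same CSV row, measured {spread:.2f} s apart")        # 1.48 s, call it ~1.5 s
print(f"Swap the roles of A and B and the range doubles: ~{2 * spread:.2f} s")
```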
Summary
1) The FDR did not record the final moments of Flight 77. There is up to 2 seconds missing.
2) The CSV file is not meant to be analyzed forensically; it is meant to be plotted.
3) The CSV data is not raw FDR data. It is not even serial bitstream data.
4) The CSV data is not meant to be broken down into 1/8th seconds and analyzed.
5) The CSV data, properly interpreted, says that there are N samples during this particular frame.
6) Without the frame description, we do not know when in a frame any one sample occurred.
7) Without the frame description, we have lost the measurement timestamps, so the time a particular word was recorded does not necessarily equate with when it was measured.
8) Given these time-shift errors, any mathematics that uses more than one data-point runs the risk of assuming that two numbers occurred at the same time, when they didn’t.
9) Many of these errors can be corrected, greatly, with the frame descriptor.
10) Any analysis must account for (or justify ignoring) these issues in order to draw any valid conclusions.