Seasat – Technical Challenges – 5. Classification of Bad Data

Despite all of the work done to decode and clean the data, many errors remained in the supposedly fixed files that had been decoded and multi-pass filtered. As a result, the current count of swath files that can be processed is 1,346 rather than the 1,399 that first came out of prep_raw.sh. Each class of error is discussed in this section.

Classification of Bad Data: Initially 101 files were discarded. One category, repeats, was simply the result of one tape portion being read twice. Eight duplicate files were discarded.

5.1 Short Files

Six files fewer than 10,000 lines long were discovered and removed from the processing set. The number of lines needed to create a 100-km length frame was later determined to be 24,936. Thus, any file shorter than that could also be removed, since it cannot produce a full frame.
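The two length thresholds above can be expressed as a simple check. This is only a sketch: the function name and the category labels are hypothetical, and only the two line counts come from the text.

```python
SHORT_FILE_LINES = 10_000  # cutoff used to discard short files (from the text)
FRAME_LINES = 24_936       # lines needed for one 100-km frame (from the text)

def classify_length(n_lines):
    """Label a swath by its range-line count (labels are illustrative only)."""
    if n_lines < SHORT_FILE_LINES:
        return "short"      # discarded in the original screening
    if n_lines < FRAME_LINES:
        return "sub-frame"  # too short to yield a full 100-km frame
    return "ok"
```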

5.2 Constant Time

Seven different files from Tape12 all have a constant time of 16777216 throughout. Obviously, this data could not be processed and was discarded. Also, the first five files of tape10 had values of 134217727, but these were discarded during the initial data prep.
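A check for this class of error might look like the sketch below. It assumes the decoded MSEC values for a file are already available as a list of integers, and it assumes that MSEC is a millisecond-of-day counter, which the source does not state explicitly. Both function names are hypothetical.

```python
MSEC_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000

def is_constant_time(times):
    """True when a file has more than one line and every MSEC value is identical."""
    return len(times) > 1 and len(set(times)) == 1

def exceeds_one_day(msec):
    """True when msec cannot be a millisecond-of-day value.

    ASSUMPTION: MSEC counts milliseconds since the start of the day.
    """
    return not 0 <= msec < MSEC_PER_DAY
```

Under that assumption, 134217727 is out of range, whereas 16777216 is a legal value that is only suspicious because it never changes.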

5.3 Random Time

Three files were identified with nearly random time values. Again, due to the nature of SAR processing, these files were not used. Those files were fixed_tape16_620Kto677K.017_000.hdr, fixed_tape26.039_000.hdr, and fixed_tape28.002_000.hdr.

Constant Time: When the fix routines were applied to a file with constant times, this is the result. Because the codes are trying to fit the times to a “known” slope when none actually exists in the data, spurious times are introduced.

Random Times: Metadata plots of one file from the Random Time error category.

Random Times: Time plot of one file from the Random Time error category showing decoded MSECs vs. range line. This data cannot be recovered.

5.4 Every Other Zero

Seventeen files were found with some number of headers that contained only zero values. Anywhere from 1 percent to 100 percent of the metadata in these files was blank, so they were removed from the data set.
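The blank fraction for a file could be measured along the following lines. This is a minimal sketch that assumes each header has already been decoded into a sequence of integers and that "blank" means every value in the header is zero. The function name is hypothetical.

```python
def blank_header_fraction(headers):
    """Fraction of headers whose values are all zero (0.0 for an empty file)."""
    if not headers:
        return 0.0
    blank = sum(1 for header in headers if all(value == 0 for value in header))
    return blank / len(headers)
```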

5.5 Time Gap

One file was found with a very large time gap that could not be fixed. It was removed.

5.6 Time Slope

Based upon slope analysis, thirty-two files were marked as having an incorrect time slope. This was the beginning of the realization that something was wrong with the Seasat time fields. This is discussed in full in section 6, “Tackling the Slope Issues.” This class of errors was revisited and the number was reduced to a mere six swaths that didn’t process correctly.

5.7 Wrong Fix

Initially, 27 files were placed in the wrong fix category based upon visual inspection of the supposedly fixed files. It turns out that these resulted from the time slope assumptions that were built into all of the software. In other words, all of the programs that did linear trending tried to restrict the slopes to be around 0.607. Unfortunately, this was a bad assumption. This class of errors was revisited and the number was reduced to a mere five swaths that didn’t process correctly — see section 6, “Tackling the Slope Issues,” for more details.

Wrong Fix Examples

Wrong Fix: Erratic times result from trying to restrict the data to a specific slope.

Wrong Fix or Bad Slope? This shows a bad fix resulting from trying to fit a slope that is incorrect for the actual raw data. In this case, the data time slope is lower than that being enforced by the programs — thus the incorrect red times.

Wrong Fix or Bad Slope? Subtle wrong fix with bit errors adding to the confusion.

Aside: Wrong Fix or Bad Slope – More About Seasat Times

The pulse repetition frequency (PRF) is the frequency with which pulses are sent from the satellite. For Seasat, the PRF is 1647 Hz. This means that 1,647 lines are transmitted and received per second. Inversely, this means that each line should be sent at equal intervals of 1/1647 = 0.00060716 seconds.
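The arithmetic can be checked with a short sketch. It also shows what an integer millisecond counter would record for consecutive lines, assuming the counter simply truncates the elapsed time. That truncation behavior and the function names are assumptions made for illustration, not a description of the actual Seasat clock.

```python
PRF_HZ = 1647  # Seasat pulse repetition frequency (from the text)

def line_interval_msec(prf_hz=PRF_HZ):
    """Time between consecutive range lines, in milliseconds."""
    return 1000.0 / prf_hz

def truncated_msec_counter(n_lines, start_msec=0, prf_hz=PRF_HZ):
    """Integer msec value for each of n_lines lines, ASSUMING truncation.

    Integer arithmetic is used so no floating-point rounding creeps in.
    """
    return [start_msec + (i * 1000) // prf_hz for i in range(n_lines)]

if __name__ == "__main__":
    print(round(line_interval_msec(), 5))  # 0.60716
    print(truncated_msec_counter(11))      # [0, 0, 1, 1, 2, 3, 3, 4, 4, 5, 6]
```

The simulated counter advances by 3 over every 5 lines on average, and by exactly 1,000 over 1,647 lines.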

Thus, in the decoded header, which has the MSEC time value in column 6, we expect to see the time change by about 0.6 msec per line. Of course, the value is stored in integer milliseconds, so the counter cannot capture a 0.6 msec step. What we really expect to see is the counter increasing by 3 every 5 lines. Something like the following example is good data:

From TAPE3_01Kto455K_000:

14 124195 5 8 194 45440300 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
15 133045 5 8 194 45440301 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
16 142042 5 8 194 45440301 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
17 150892 5 8 194 45440302 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
18 159890 5 8 194 45440302 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
19 168740 5 8 194 45440303 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
20 177737 5 8 194 45440304 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
21 186587 5 8 194 45440304 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
22 195585 5 8 194 45440305 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
23 204435 5 8 194 45440306 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
24 213432 5 8 194 45440306 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
25 222282 5 8 194 45440307 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
26 231280 5 8 194 45440307 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
27 240130 5 8 194 45440308 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
28 249127 5 8 194 45440309 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
29 257977 5 8 194 45440309 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
30 266827 5 8 194 45440310 2716 0 5 1 4 22 1 1 0 0 0 0 1 0
31 275825 5 8 194 45440310 2716 0 5 1 4 22 1 1 0 0 0 0 1 0

Here, we see the time values go from 45440300 to 45440310 over the course of 16 lines (lines 14 to 30). This gives a line time of 10 msec / 16 lines, or 0.625 msec/line, which is definitely in the correct range. This is not always the case, however, even with time filtering. In TAPE10_01Kto364K, the first five files all have times of 134217727, an impossible value. Almost all of TAPE10_01Kto364K_006 has zero values for the time, while on TAPE10_01Kto364K_007 the times are 132819626, another impossible value.

From TAPE10_01Kto364K(_000-_005)

20 171395 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
21 180245 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
22 189095 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
23 198092 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
24 206942 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
25 215940 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
26 224790 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
27 233787 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
28 242637 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0
29 251635 9 8 511 134217727 4095 0 5 1 4 18 0 1 0 0 0 0 1 0

From TAPE10_01Kto364K_006:

50 417425 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
51 426275 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
52 435272 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
53 444122 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
54 453120 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
55 461970 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
56 470967 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
57 479817 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
58 488815 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
59 497665 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0
60 506662 9 8 0 0 4095 0 5 1 4 20 0 1 0 0 0 0 1 0

From TAPE10_01Kto364K_007

50 455037 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
51 463887 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
52 472885 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
53 481735 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
54 490732 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
55 499582 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
56 508580 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
57 517430 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
58 526427 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0
59 535277 9 8 511 132819626 4095 0 1 1 4 8 0 1 0 0 0 0 1 0
60 544275 9 8 511 132819626 4095 0 5 1 4 8 0 1 0 0 0 0 1 0

From TAPE4_01Kto688K_012

18 50583644 6 8 202 13851551 2338 0 5 1 4 9 1 1 0 0 0 0 1 0
19 50592494 6 8 202 13851551 2338 0 5 1 4 9 1 1 0 0 0 0 1 0
20 50601491 6 8 202 13851552 2338 0 5 1 4 9 1 1 0 0 0 0 1 0
21 50609604 6 8 202 13851553 2338 0 5 1 4 9 1 1 0 0 0 0 1 0
22 50618454 6 8 202 13851553 2338 0 5 1 4 9 1 1 0 0 0 0 1 0
. .
118 51303591 6 8 202 13851599 2338 0 5 1 4 9 0 1 0 1 0 0 0 0
119 51304329 6 8 202 13851599 2338 0 5 1 4 9 1 1 0 0 0 0 1 0
120 51313179 6 8 202 13851601 2338 0 5 1 4 9 0 0 0 0 0 0 0 0
121 51321291 6 8 202 13851601 2338 0 5 1 4 9 1 1 0 0 0 0 1 0
122 51330289 6 8 202 13851601 2338 0 5 1 4 9 1 1 0 0 0 0 1 0

In this last example from Tape4, we see that the time at line 20 is 13851552. The time at line 120 is 13851601. Thus, the time changed by 49 msec in 100 lines. This gives a time slope of 0.49 msec/line, considerably less than the 0.607 msec/line expected. In fact, this pattern continues through the file, with all of the times showing an incorrect time slope of around 0.486 msec/line. Initially, it was assumed that this data could not be recovered. However, it was later determined that much of the Seasat data thought to have bad time slopes in fact simply had an unknown timing delay recorded with the satellite clock times. See section 6, “Tackling the Slope Issues,” for details.
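The slope comparison can be reproduced from the excerpts with an ordinary least-squares fit of MSEC against line number. The sketch below is not the project's own trending code. The function names and the 0.05 msec/line tolerance are hypothetical, and the sample points are the (line, MSEC) pairs copied from the TAPE3 and TAPE4 listings in this section.

```python
EXPECTED_SLOPE = 1000.0 / 1647  # msec per line at a PRF of 1647 Hz

def fit_slope(samples):
    """Least-squares slope of MSEC against line number for (line, msec) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx

def slope_is_plausible(slope, tolerance=0.05):
    """True when slope is within a HYPOTHETICAL tolerance of the expected value."""
    return abs(slope - EXPECTED_SLOPE) <= tolerance

# (line, MSEC) pairs from the TAPE3_01Kto455K_000 excerpt
TAPE3 = [(14, 45440300), (15, 45440301), (16, 45440301), (17, 45440302),
         (18, 45440302), (19, 45440303), (20, 45440304), (21, 45440304),
         (22, 45440305), (23, 45440306), (24, 45440306), (25, 45440307),
         (26, 45440307), (27, 45440308), (28, 45440309), (29, 45440309),
         (30, 45440310), (31, 45440310)]

# (line, MSEC) pairs from the TAPE4_01Kto688K_012 excerpt
TAPE4 = [(18, 13851551), (19, 13851551), (20, 13851552), (21, 13851553),
         (22, 13851553), (118, 13851599), (119, 13851599), (120, 13851601),
         (121, 13851601), (122, 13851601)]

if __name__ == "__main__":
    for name, samples in (("TAPE3", TAPE3), ("TAPE4", TAPE4)):
        slope = fit_slope(samples)
        print(name, round(slope, 3), slope_is_plausible(slope))
```

On these excerpts the fit gives roughly 0.61 msec/line for TAPE3 and roughly 0.48 msec/line for TAPE4. A failed check should be treated as a flag for closer inspection rather than proof of unrecoverable data, given what was later learned about the timing delay.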

Written by Tom Logan, July 2013