At this stage in the development of the ASF Seasat Processing System (ASPS): 1,346 cleaned raw signal swaths were created; ROI was modified to handle Seasat offset video format; New state vectors were selected for use over two-line elements (TLE’s); Caltones were filtered from the range power spectra; Data window position files were created…
In order to create a synthetic aperture for a radar system, one must combine many returns over time. For Seasat, a typical azimuth reference function — the number of returns combined into a single focused range line — is 5,600 samples. Each of these samples is actually a range line of radar echoes from the ground. Properly combining all of these lines requires knowing precisely when a range line was received by the satellite.
In practice, SAR systems transmit pulses of energy equally spaced in time. This time is set by the pulse repetition frequency (PRF); for Seasat, 1,647 pulses are transmitted every second. Alternatively, it can be said a pulse is transmitted every 0.607165 milliseconds, an interval commonly referred to as the pulse repetition interval or PRI. Without this constant time between pulses, the SAR algorithm would break down and data would not be focused to imagery.
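As a quick check of the arithmetic, the PRI follows directly from the PRF. The second line is only an illustrative consequence, assuming the typical 5,600-line azimuth reference function mentioned above:

```latex
\mathrm{PRI} = \frac{1}{\mathrm{PRF}} = \frac{1}{1647\ \mathrm{Hz}} \approx 6.07165 \times 10^{-4}\ \mathrm{s} = 0.607165\ \mathrm{ms}

5600 \times 0.607165\ \mathrm{ms} \approx 3.4\ \mathrm{s}
```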
Many errors existed in the Seasat raw data decoded at ASF. As a result, multiple levels of filtering were required to deal with issues present in the raw telemetry data, particularly with time values. Only after this filtering were the raw SAR data processible to images.
4.1 Median Filtering and Linear Regression
The first attempt at cleaning the data involved a simple one-pass filter of the pertinent metadata parameters. The following seven parameters were median filtered to pull out the most commonly occurring value: station code, day of year, clock drift, delay to digitization, least significant digit of year, bits per sample and PRF rate code. A linear regression was used to clean the MSEC of Day metadata field. This logic is encapsulated in the program fix_headers, which is discussed in more detail in “Final Form of Fix_Headers.”
Implementing the median filter was straightforward:
- Read through the header file and maintain histograms of the relevant parameters.
- Use the local median value to replace the decoded value and create a cleaned header file. Here “local” refers to the 400 values preceding the value to be replaced.
This scheme works well for cleaning the constant and rarely changing fields. It also seems to work quite well for the smoothly changing clock drift field. At the end of this section are the examples from “Problems with Bit Fields,” along with the corresponding median-filtered versions of the same metadata parameters. In each case, the median filter created clean usable metadata files.
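The two steps above can be sketched in Python as follows. This is a minimal illustration, not the fix_headers source: the function name `trailing_median_filter` is hypothetical, and the sketch sorts a sliding window rather than maintaining histograms, which gives the same median for small windows.

```python
from collections import deque
from statistics import median_low


def trailing_median_filter(values, window=400):
    """Replace each value with the median of up to `window` preceding decoded values."""
    history = deque(maxlen=window)  # raw values seen so far, oldest dropped first
    cleaned = []
    for v in values:
        # The very first value has no history, so it is kept as decoded.
        # median_low always returns a value that actually occurred in the data,
        # which suits integer code fields such as the station code.
        cleaned.append(median_low(history) if history else v)
        history.append(v)
    return cleaned
```

For a constant field with one corrupted entry, for example ten 5s, a 133, then ten more 5s, every output value is 5.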
Performing the linear regression on the MSEC of Day field was also straightforward. Unfortunately, the results were far from expected or optimal. The sheer volume of bit errors combined with discontinuities and timing dropouts made the line slopes and offsets highly variable inside a single swath. These issues will be examined in the next section.
Table of Filtered Parameters
|station_code||Median||Constant per datatake|
Data Cleaning Examples
Day of Year
Last Digit of the Year
Bits Per Sample
PRF Rate Code
Delay to Digitization
4.2 Time Cleaning
Creating a SAR image requires combining many radar returns over time. This requires that very accurate times are known for every SAR sample recorded. In the decoded Seasat data, the sheer volume of bit errors, combined with discontinuities and timing dropouts, resulted in highly variable times inside a single swath.
4.2.1 Restricting the Time Slope
One of the big problems with applying a simple linear regression to the Seasat timing information was that the local slope often changed drastically from one section of a file to another based upon bit errors, stair steps and discontinuities. Since a pulse is transmitted every 0.607165 milliseconds, it seemed that the easiest way to clean all of the MSEC times would be to simply find the fixed offset for a given swath file and then apply the known time slope to generate new time values for a cleaned header file.
This restricted slope regression was implemented when it became obvious that a simple linear regression was failing. By restricting the time slope of a file to be near the 0.607-msec/line known value, it was assumed that timing issues other than discontinuities could be removed. The discontinuities would still have to be found and fixed separately in order for the SAR focusing algorithm to work properly. Otherwise, the precise time of each line would not be known.
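A minimal sketch of the fixed-offset idea follows, assuming the slope is held exactly at the PRI rather than merely near it. The function name `fixed_slope_times` and the use of a median to estimate the offset are assumptions made for illustration, not details taken from the ASF code.

```python
from statistics import median

PRI_MSEC = 1000.0 / 1647.0  # about 0.607165 msec per line


def fixed_slope_times(msec, slope=PRI_MSEC):
    """Estimate one offset for the file with the slope held fixed, then regenerate times."""
    # Every line yields its own offset estimate; taking the median (an assumption)
    # keeps bit errors and dropouts from dragging the estimate around.
    offset = median(t - slope * i for i, t in enumerate(msec))
    return [offset + slope * i for i in range(len(msec))]
```

Because a single offset is used for the whole file, any real discontinuity would be smeared into the regenerated times, which is why discontinuities still need separate handling.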
4.2.2 Removing Bit Errors from Times
- Crude time filtering that tries to fix all values more than 513 from the local linear trend:
- Bit fixes – correct values that differ from the trend by a power of 2
- Fill fixes – fill gaps between identical consecutive values
- Linear fixes – replace values with the linear trend
- Reads and writes a file of headers
Even with linear regressions and time slope limitations, times still were not being brought into reasonable ranges. Too many values were in error in some files, and a suitable linear trend could not be obtained. So, another layer of time cleaning was added as a pre-filter to the final linear regression done in fix_headers. The program fix_time was initially created just to look for bit errors, but was later expanded to incorporate each of three different filters at the gross level (i.e. only values > 513 from a local linear trend are changed):
- If the value is an exact power of 2 off from the local linear trend, then add that power of 2 into the value. This fix attempts to first change values that are wrong simply because of bit errors. The idea is that this is a common known error type and should be assumed as the first cause.
- Else if the value is between two values that are the same, make it the same as its neighbors. This fix takes advantage of the stair steps found in the timing fields. It was only added in conjunction with the fix_stairs program discussed below. The idea is to take advantage of the fact that the stair steps are easily corrected using the known satellite PRI.
- Else just replace the value with the local linear trend. At this point, it is better to bring the points close to the line than to leave them with very large errors.
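The three-tier logic above might be sketched as follows. This is not the fix_time source: the local linear trend is assumed to be supplied by the caller as a list of predicted values, and the tolerance `tol` used to decide that an offset is a power of 2 is a hypothetical parameter, needed here because the trend is a floating-point estimate.

```python
import math


def fix_time(times, trend, gross=513, tol=1.0):
    """Apply bit, fill and linear fixes to values far from the local trend."""
    fixed = list(times)
    for i, (t, expected) in enumerate(zip(times, trend)):
        diff = t - expected
        if abs(diff) <= gross:
            continue  # only gross errors are touched at this stage
        power = 2 ** round(math.log2(abs(diff)))  # nearest power of 2 to the offset
        if abs(abs(diff) - power) <= tol:
            # Bit fix: undo the single flipped bit, keeping the low-order bits.
            fixed[i] = t - int(math.copysign(power, diff))
        elif 0 < i < len(times) - 1 and times[i - 1] == times[i + 1]:
            # Fill fix: the value sits inside a stair step, so copy its neighbors.
            fixed[i] = times[i - 1]
        else:
            # Linear fix: fall back to the trend itself.
            fixed[i] = round(expected)
    return fixed
```

The order of the branches mirrors the list: a bit error is assumed first, a stair step second, and the trend is used only as a last resort.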
4.2.3 Removing Stair Steps from Times
- Fix for sticky time field – Turns “stairs” into “lines” by replacing repeated time values with linear approximation for better linear trend
- Reads and writes a file of headers
For yet another pre-filter, it was determined that the stair steps time anomaly should be removed before fitting points to a final linear trend. This task is relatively straightforward: If several time values in a row are the same, replace them with values that fit the known time slope of the satellite. The program fix_stairs was developed to deal with this problem.
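A minimal sketch of this replacement follows. It is an illustration only: the function name mirrors the program, but the `min_run` threshold is an assumption, chosen because integer millisecond times naturally repeat once or twice at a slope of about 0.607 msec per line.

```python
PRI_MSEC = 1000.0 / 1647.0  # known time slope, msec per line


def fix_stairs(times, slope=PRI_MSEC, min_run=3):
    """Turn runs of repeated time values into a ramp with the known slope."""
    fixed = [float(t) for t in times]
    n = len(times)
    start = 0
    while start < n:
        # Find the end of the run of identical values beginning at `start`.
        end = start
        while end + 1 < n and times[end + 1] == times[start]:
            end += 1
        if end - start + 1 >= min_run:
            # Replace the stair with values that climb at the known slope.
            for j in range(start, end + 1):
                fixed[j] = times[start] + slope * (j - start)
        start = end + 1
    return fixed
```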
Extreme Stair Step: raw0.headers (blue) are the original unfiltered MSEC time values; newest.headers (red) shows the result of fixing the “stairs” that result from the sticky clock, presumably an artifact of the Seasat hardware, not the result of bit rot, transcription or other errors
4.2.4 Final Form of fix_headers
- Miscellaneous header cleaning using median filters:
- Station Code
- Least Significant Digit of Year
- Day of Year
- Clock Drift
- Bits Per Sample
- PRF Rate Code
- Delay to Digitization
- Time Discontinuity Location and Additional Filtering
- Replace all values > 2 from linear trend with linear trend
- Locates discontinuities in time, making an annotated file for later use.
- 5 bad values with same offset from trend identify a discontinuity
- +1 discontinuity is forward in time and can be fixed
- if > 4000, too large – cannot be fixed
- otherwise, slide time to fit discontinuity
- if offset > 5 time values, save this discontinuity in a file
- -1 discontinuity is backwards in time and cannot be fixed
- Reads and writes a file of headers
Although it started out as the main cleaning program, fix_headers is currently the final link in the cleaning process. Metadata going through fix_headers has already been partially fixed by reducing bit errors, removing stair steps, and bringing all other values that show very large offsets into a rough linear fit. So in addition to performing median filtering on important metadata fields (see section 4.1), this program performs the final linear fit on the time data.
Initially, a regression is performed on the first window of 400 points and used to fix the first 200 time values of that window. Any values that are more than 5 msec from their predecessors are replaced by the linear fit. After this, a new fit is calculated every window/2 samples, but never within 100 samples of an actual discontinuity. The final task for fix_headers was to locate the rough locations of all real discontinuities that occur in the files. At least, it was designed to only find real discontinuities – those being the final problem hindering the placement of reasonable linear times in the swath files.
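The windowed fit can be sketched roughly as below. This is a simplified illustration under stated assumptions: it uses ordinary least squares, compares each value with the fitted line rather than with its predecessor, and omits the rule about staying 100 samples away from a discontinuity. The names `fit_line` and `windowed_fit` are hypothetical.

```python
def fit_line(ys):
    """Least-squares fit of ys against 0..len(ys)-1; returns (slope, intercept)."""
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def windowed_fit(times, window=400, tol=5.0):
    """Refit every window/2 lines and replace values more than `tol` from the fit."""
    fixed = list(times)
    step = window // 2
    for start in range(0, len(times), step):
        chunk = times[start:start + window]
        if len(chunk) < 2:
            break
        last = start + window >= len(times)  # final window repairs all of its lines
        slope, intercept = fit_line(chunk)
        for j in range(len(chunk) if last else step):
            predicted = intercept + slope * j
            if abs(chunk[j] - predicted) > tol:
                fixed[start + j] = predicted
        if last:
            break
    return fixed
```

Ordinary least squares is sensitive to large outliers, so a sketch like this only behaves well on data that has already been through the pre-filters described earlier.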
Identifying discontinuities was challenging. Much trial and error resulted in a code that worked for nearly all cases and was able to be configured to work for the other cases as well. The basic idea is that if a gap is found in the data, and if after the gap no other gaps occur within 5 values, then it is possible that a discontinuity exists. If so, the program determines the number of lines that would have to be missing to create such a gap and records the location and size in an external discontinuity file. Note that only forward discontinuities can be fixed in this manner and only discontinuities less than a certain size. In practice, the procedure attempts to locate gaps of up to 4,000 lines, discarding any datasets that show gaps larger than this.
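The basic test can be sketched as follows, assuming a linear trend (slope and intercept) is already available for the region being examined. The function name, the status strings and the tolerance `tol` are hypothetical; the run length of 5 and the 4,000-line limit come from the description above.

```python
def find_discontinuity(times, slope, intercept, run=5, tol=2.0, max_lines=4000):
    """Return (index, missing_lines, status) for the first discontinuity, or None."""
    # Offset of every time value from the linear trend, in msec.
    offsets = [t - (intercept + slope * i) for i, t in enumerate(times)]
    for i in range(len(offsets) - run + 1):
        first = offsets[i]
        if abs(first) <= tol:
            continue  # still on the trend line
        # `run` consecutive values sharing one offset suggest a real time shift.
        if all(abs(o - first) <= tol for o in offsets[i:i + run]):
            lines = round(first / slope)  # lines that must be missing to explain it
            if lines < 0:
                return i, lines, "backward: cannot be fixed"
            if lines > max_lines:
                return i, lines, "too large: cannot be fixed"
            return i, lines, "forward: fixable"
    return None
```

A forward jump of 182.1 msec at a slope of 0.607 msec per line, for example, corresponds to 300 missing lines.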
Extreme Discontinuity: This decoded signal data shows a 5.3-hour gap in time. This cannot be fixed.
In the end, all of the gaps in the data were identified, and there is high confidence that any such discontinuities found are real and not just the result of bit errors or other problems. Unfortunately, this method was not able to pinpoint the start of problems, only that they existed, as shown in the following set of graphs.
Decoded Signal Data with No Y-range Clipping
Decoded Signal Data with Y-range Clipping
Linear Trend of Decoded Signal Data
Comparison of Decoded Signal Data and Linear Fit
False Discontinuity #1
False Discontinuity #2
4.2.5 Removing Discontinuities
- fix all time discontinuities in the raw swath files
- for each entry in previously generated discontinuity file:
- search backwards from discontinuity looking for points that don’t fit new trend line
- when 20 consecutive values that are not within 1.5 of new line are found, you have found start of discontinuity
- For length of discontinuity
- Repeat header line in .hdr file (fixing the time only)
- Fill .dat file with random values
- Reads discontinuity file, original .dat and .hdr file, and cleaned .hdr file. Creates final cleaned .dat and .hdr file ready for processing
The first task in removing the discontinuities is locating them. The rough area of each real discontinuity can be found using the fix_headers code as described in the previous section.
Finding the exact start and length of each discontinuity still remains to be done. This search and the act of filling each gap thus discovered is performed by the program dis_search. The discontinuity search is performed backward, with 3000 lines after each discontinuity area being cached and then searched for a jump down in the time value to the previous line. These locations were marked as the actual start of the discontinuity. The gap in the raw data between the time before the discontinuity and the time after must then be filled in. Random values were used for fill, these being the best way to not impact the usefulness of the real SAR data.
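The backward search and random fill might look roughly like this. It is a sketch under several assumptions, not the dis_search source: times are held in a list already cleaned of bit errors, each range line is a `bytes` object, and samples are taken to be 5-bit values stored one per byte (`levels=32`). The names `locate_jump` and `fill_gap` and the `min_jump` threshold are hypothetical.

```python
import random


def locate_jump(times, rough_index, search=3000, min_jump=5.0):
    """Search backward from rough_index + search for the first line after the gap."""
    stop = min(len(times) - 1, rough_index + search)
    for i in range(stop, 0, -1):
        # Walking backward, the time drops sharply when the gap is crossed.
        if times[i] - times[i - 1] > min_jump:
            return i
    return None


def fill_gap(lines, times, jump_index, slope, line_bytes, levels=32, rng=random):
    """Insert random-valued lines and evenly spaced times to close the gap."""
    gap = times[jump_index] - times[jump_index - 1]
    missing = round(gap / slope) - 1  # lines lost between the two neighbors
    before = times[jump_index - 1]
    fill_times = [before + slope * (k + 1) for k in range(missing)]
    # Random samples (assumed range 0..levels-1) stand in for the lost echoes.
    fill_lines = [bytes(rng.randrange(levels) for _ in range(line_bytes))
                  for _ in range(missing)]
    new_lines = lines[:jump_index] + fill_lines + lines[jump_index:]
    new_times = times[:jump_index] + fill_times + times[jump_index:]
    return new_lines, new_times
```

After the fill, consecutive times differ by the slope everywhere, so the focusing step sees an unbroken sequence of range lines.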
Discontinuity Fills: Each plot shows range line number versus MSEC metadata value. Original decoded metadata (left) is spotty and contains an obvious time discontinuity. After the discontinuity is found and corrected, linear time is restored (right).
During decoding and cleaning, it was assumed that the time slope of the files would be roughly guided by the Pulse Repetition Interval (PRI) of the satellite, i.e. a Pulse Repetition Frequency (PRF) of 1647 Hz means that 1,647 lines are being transmitted and received per second. This means that the PRI is 0.00060716 seconds (0.60716 msec). Based upon this, then, each 1,000 lines of Seasat data should be equivalent to 0.60716 seconds.
Alternately, in milliseconds, the time slope for these files should always be 0.60716. It was discovered that this is not the case with much of the actual data, as shown in the following table and graphs:
|Line||Time||Time Diff||Calculated Slope|
Seasat Times: PRF = 1647 Hz, so PRI is 0.0006071645 sec. In MSEC, the time slope should always be 0.6071645. Yet, for this datatake, the time slope is consistently only 0.486!
Original Data: This example shows a dataset that is relatively clean before any filtering is applied. It seems that this file should have been extremely easy to clean.
Filtered Data: After the dataset went through the prep_raw.sh procedure, this was the resulting time plot. It is, quite obviously, very wrong.
Comparison of Original with Filtered: Although the times look fine in the first (unfiltered) plot, they are wrong for this satellite based upon the known PRI. The ASF cleaning software tried to fix these wrong time values using a known slope of 0.607. This introduced a discontinuity into the data and resulted in incorrect times.
These results, wherein the time slope of the raw data does not match the known PRI of the satellite, were incredibly perplexing. At first, it was assumed that these data could not be processed reliably and were simply categorized into the large time-slope error and wrong-fix error categories.
Analysis of the time slopes in the original unfiltered data only pointed out how extreme the problem really was. Well over 100 files showed slopes that were either less than 0.606 or more than 0.608, with the lowest in the 0.48 range. The highest reliable estimate showed a slope of well over 0.62.
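A slope survey of this kind can be sketched as follows. The approach shown, taking the median of time differences over a fixed line stride, is an assumption chosen for robustness against bit errors and is not necessarily how the ASF analysis was done. The function names are hypothetical; the 0.606 to 0.608 limits come from the text above.

```python
from statistics import median


def estimate_slope(times, stride=1000):
    """Median of (time difference / line difference) for lines `stride` apart.

    Needs more than `stride` lines; otherwise statistics.median raises an error.
    """
    return median((times[i + stride] - times[i]) / stride
                  for i in range(len(times) - stride))


def slope_is_nominal(slope, low=0.606, high=0.608):
    """True when the measured slope agrees with the PRI-based expectation."""
    return low <= slope <= high
```

A file whose times advance by only 0.486 msec per line would be flagged by `slope_is_nominal`, while one advancing by 0.6071645 msec per line would pass.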
6.1 Slope Issues Explained
Eventually, through conversation with original Seasat engineers at the Jet Propulsion Lab (JPL), it was discovered that the Seasat metadata field MSEC of Day actually contains not only the time of imaging but also the time to transmit data from the spacecraft to the ground station. This adds a variable time offset to the metadata field. Once this was understood, it was readily obvious that using the known PRF as a guide for filtering was an incorrect solution.
Thus, the entire cleaning process was revisited, with all of the codes allowing more relaxed slope values during linear regression. This worked considerably better than the previous cleaning attempt. However, it did not solve the problems entirely.
6.2 Final Results of Data Cleaning
The final set of cleaned Seasat raw swaths was assembled using three main passes through the archives with different search parameters, along with a few files that were fixed on a case-by-case basis. Basically, the final version of the code was run and the results examined for remaining time gaps. Any files with large or many time gaps were reprocessed using different parameters. In the end, 1,346 swaths were cleaned: 2 by hand, 14 from the first pass, 25 from the second pass, and the remainder in the final cleaning pass. These then are the final cleaned Seasat archives for the initial release of ASF’s Seasat products.
|Date||Total Datasets||Dataset with Time Gaps||Largest Time Gap||Largest number of gaps in a file||Files with >10 msec gap|
Final Cleaned Seasat Swaths: Approximately one year after the project started, 1,346 raw Seasat swaths were cleaned and ready to be processed into SAR image products.
Written by Tom Logan, July 2013
With the Seasat archives decoded into range line format along with an auxiliary header file full of metadata, the next step is to focus the data into synthetic aperture radar (SAR) imagery. Focusing is the transformation of raw signal data into a spatial image. Unfortunately, pervasive bit errors, data dropouts, partial lines, discontinuities and many other irregularities were still present in the decoded data.
3.1 Important Metadata Fields
In order for the decoded SAR data to be focused properly, the satellite position at the time of data collection must be known. The position and velocity of the satellite are derived from the timestamp in each decoded data segment, making it imperative that the timestamps are correct in each of the decoded data frames.
Slant range is the line of sight distance from the satellite to the ground. This distance must be known for focusing reasons and for geolocation purposes. As the satellite distance from the ground changes during an orbit, the change is quantified using the delay-to-digitization field. During focusing, the slant range to the first pixel is calculated using these quantified values. More specifically, the slant range to the first pixel (srf) is determined using the delay to digitization (delay), the pulse repetition frequency (PRF) and the speed of light (c):
It turns out that the clock drift is also an important metadata field. Clock drift records the timing error of the spacecraft clock. Although it is not known how this field was originally created, adding this offset to the day of year and millisecond of day produced more accurate geolocations in the focused Seasat products.
Finally, although not vital to the processing of images, the station code provides information about where the data was collected and may be useful for future analysis of the removal of systematic errors.
3.2 Bit Errors
It is assumed that the vast majority of the problems in the original data are due to bit errors resulting from the long dormancy of the raw data on magnetic tapes. The plots in section 2.1 showed typical examples of the extreme problems introduced by these errors, as do the following time plots.
3.3 Systematic Errors in Timing
Beyond the bit errors, other, more systematic errors affect the Seasat timing fields. These include box patterns, stair steps and data dropouts.
To top off the problems with the time fields, discontinuities occur on a regular basis in these files. Some files have none; some have hundreds. Some discontinuities are small — only a few lines. Other discontinuities are very large — hundreds to thousands of lines. Focusing these data required identifying and dealing with discontinuities.
Forward Time Discontinuity
Backward Time Discontinuity
Aside: Initial Data Quality Assessment
Of the 1,470 original decoded data swaths:
- Datasets with Time Gaps (>5 msec): 728
- Largest Time Gap: 54260282
- Largest Number of Gaps in a Single File: 1,820
- Number of files with stair steps: 295
- Largest percentage of valid repeated times: 63%
- Number of files with more than one partial line: 1,170
- Largest percentage of partial lines: 42%
- Number of files with bad frame numbers: 1,470
- Largest percentage of bad frame numbers: 17%
After the decoding, cleaning and focusing of the Seasat SAR data, many artifacts still exist in the initial ASF Seasat SAR products….
In spite of all of the work done to decode and clean data, many errors remained in the supposedly fixed files that had been decoded and multi-pass filtered….
Soil Moisture Passive Active (SMAP) is a remote-sensing observatory with two instruments — a synthetic aperture radar (SAR) and a radiometer — that map soil moisture and determine the freeze or thaw state of the area being mapped….
“A rare characteristic of the SMAP Project is its emphasis on serving both basic Earth System science as well as applications in operational and practice-oriented communities.”
Contents of Full Handbook
1. Introduction and Background
2. Mission Overview
3. Instrument Design and L1 Data Products
4. Soil Moisture Data Products
5. Value-Added L4_SM Soil Moisture Product
6. Carbon Cycle Data Products
7. Science Data Calibration and Validation
8. NASA SMAP Applications Program
9. SMAP Project Bibliography
SMAP Handbook Excerpts on ASF's Roles
The Alaska Satellite Facility (ASF) is one of four ground stations that support the SMAP mission and one of two NASA DAACs that distribute SMAP data. The SMAP baseline science data products will be generated within the project’s Science Data System and made available publicly through the two NASA-designated Earth-science data centers. The ASF Synthetic Aperture Radar (SAR) Distributed Active Archive Center (DAAC) will provide Level 1 radar products, and the National Snow and Ice Data Center (NSIDC) DAAC will provide all other products. The excerpts below from the SMAP Handbook focus on ASF’s roles.
ASF, part of the University of Alaska Fairbanks (UAF) Geophysical Institute, operates the SAR DAAC for NASA. For more than 20 years, ASF has worked in conjunction with the SAR research community and scientists across the globe providing near-real-time and archive data from several key Earth-observing satellites. In support of this user community, ASF offers interactive web resources for data search and download, and creates custom software tools for data interpretation and analysis.
ASF’s DAAC is one of 12 Data Centers supported by NASA and specializes in the processing, archiving, and distribution of SAR data to the global research community. In recent years, the ASF DAAC has moved from a process-on-demand to a download-on-demand data system that provides direct access to over 1 PB of SAR data. The ASF data system, comparable to the EOSDIS Core System, provides ingest, cataloging, archiving, and distribution of ASF DAAC’s complete data holdings. ASF distributes focused and unfocused SAR data products, browse images, and relevant metadata in multiple formats through the Vertex data search portal.
Ground Data System (GDS)
The primary path for commanding the SMAP observatory and returning science and engineering data is through three northern-hemisphere tracking stations and one southern-hemisphere station in Antarctica. Data return at the northern-hemisphere stations is via 11.3-m antennas located at Wallops, Virginia (WGS), Fairbanks, Alaska (ASF), and Svalbard Island, Norway (SGS). Data return at the southern-hemisphere station is via the 10-m antenna at McMurdo Station, Antarctica (MGS). The table below gives characteristics of the four stations and average contact statistics from the science orbit. Because SMAP is in a near-polar orbit, the higher latitude stations have more frequent contact opportunities.
| Ground Station | Antenna | Latitude | Average # of Contacts per day* | Average Coverage Minutes/day* |
|---|---|---|---|---|
| Svalbard (SGS) Norway | 11.3 m | 78.2°N | 10.3 | 88.3 |
| Fairbanks (ASF) Alaska | 11.3 m | 64.9°N | 6.8 | 53.7 |
| Wallops (WGS) Virginia | 11.3 m | 37.9°N | 3.3 | 25.8 |
| McMurdo (MGS) Antarctica | 10.0 m | 77.8°S | 10.4 | 90.7 |
ASF DAAC Support of NASA Missions
The ASF DAAC provides support for NASA and NASA-partner missions assigned to it by the Earth Science Data and Information System (ESDIS) Project. The ASF DAAC has extensive experience managing diverse airborne and spaceborne mission data, working with various file formats, and assisting user communities to further the use of SAR data.
These efforts are facilitated, in part, by ASF Scientists and Data Managers, who interact with mission teams, provide subject matter expertise, inform data and metadata formats, evaluate data structure and quality, and address data support needs. A key project component at ASF is the core product team, which provides integration of new datasets into the ASF data system and ensures efficient coordination and support of each mission. The team members have mission-specific expertise and consist of the following personnel:
- The Project Manager is the team leader who oversees mission activities at ASF and coordinates with external groups.
- The Product Owner is a primary product stakeholder and oversees ingest, archive, documentation, and distribution of data products as well as managing interactions with mission and ASF scientists and other stakeholders.
- The User Services Representative ([email protected]) supports data users with products and software tools and communicates user feedback or suggestions for improvement to the Project Manager and Product Owner.
- Software Engineers design, develop, and maintain software for the acquisition, processing, archiving, and distribution of satellite and aerial remote sensing data.
- Software Quality Assurance Technicians provide software and web-based-application testing prior to delivery to the production data system to ensure integrity, quality, and overall proper functionality through testing methods to uncover program defects, which in turn are reported to software engineers.
- The Technical Science Writer composes and edits a variety of ASF materials, from newsletter articles to technical documentation.
The core product team’s responsibilities for data management include:
- Ingesting, cataloging, archiving, and distributing data
- Providing guidance on file formats and integration of new file formats into the ASF data system
- Describing data products and producing user manuals and guide documents
- Creating metadata and exporting it to CMR and GCMD (Global Change Master Directory)
- Ensuring accurate metrics are reported to EMS (ESDIS Metrics System)
- Designing, developing, and deploying specialized data portals that allow online access to data products and information
- Creating software tools for data interpretation and analysis
- Assisting users with the selection and usage of data
ASF also supports NASA and partner missions through the operation of a ground station with two 11-m antennas, providing complete services, including data downlinking, commanding, and range/Doppler tracking. ASF is part of the NASA Near Earth Network (NEN) supporting a variety of low-Earth-orbit spacecraft.
ASF DAAC Data Systems
The ASF DAAC operates a custom data system designed, implemented, and supported by DAAC personnel. During its evolution, the ASF data system has moved from using primarily custom software on capital equipment to commodity hardware and commercial off-the-shelf (COTS) software and hardware solutions. This has greatly lowered development and maintenance costs for the data system, while simultaneously providing a higher level of performance. The ASF DAAC data system provides the following capabilities:
- Automated data ingest occurs from the ASF ground station as well as external data providers in a variety of media and formats.
- Ingested data are pre-processed when necessary, providing browse or derivative products.
- The central ASF data system archive is provided by a Data Direct Networks gridscaler storage system.
- This system provides direct access to over 1 PB of processed data as well as the capability for automated backups to an offsite location.
- Raw data are held in a robotic silo for access by the processing system. ASF maintains a backup in an external location in case of silo failure.
- ASF provides direct http access to DAAC data products and utilizes NASA’s User Registration System (URS) for user authentication.
- NASA data are provided to public users with no restrictions. Partner data are provided to NASA-approved users through URS for authentication and ASF’s internal database for access control.
- The data system provides web-based access to the archive through Vertex. Vertex supports the data pool with direct download of processed data.
- Through custom portals and applications, the DAAC provides additional services such as mosaic subsetting, mosaicking, and time-series analysis.
- ASF DAAC exports relevant metadata to NASA’s ECHO system.
- ASF DAAC exports ingest, archive, and download metrics to NASA’s EMS system.
- ASF DAAC assists users with data discovery and usage, maintains product documentation and use guides, and supports feedback between the ASF user community and the core product teams.
SMAP at ASF DAAC
ASF provides a variety of services, software tools, and user support to address the needs of the SMAP user community. The ASF core project team will leverage on-going collaborations with the SMAP Project to identify and prioritize SMAP user community needs, which in turn will inform development and implementation of data support and value-adding services for the mission. The SMAP website at ASF will serve as an interactive data portal, providing users with relevant documentation, custom tools and services, and ancillary data and resources.
Post-Launch SMAP Data
ASF will ingest, distribute, archive, and support post-launch Level 1 radar products for the SMAP mission. ASF will receive the Level 1 radar products from the SMAP Science Data System at the Jet Propulsion Laboratory (JPL) in Pasadena, California.
Non-SMAP Data of Interest to SMAP
ASF will cross-link from the SMAP website to data collections that complement SMAP data and are of interest to the user community. Some of these collections are distributed by ASF, including the following:
- Airborne Microwave Observatory of Subcanopy and Subsurface (AirMOSS) data products
- Jet Propulsion Laboratory Uninhabited Aerial Vehicle SAR (UAVSAR) data products
- Making Earth System Data Records for Use in Research Environments Inundated Wetlands (MEaSUREs) data products
- Advanced Land Observing Satellite-Phased Array L-band SAR (ALOS PALSAR)
- Japanese Earth Resources Satellite-1 (JERS-1) image data and mosaics
The Seasat satellite was designed to cover areas up to 75° north latitude….
The documents listed here discuss the requirements to obtain spaceborne snapshots of the Polar Regions and key high latitude processes….