M Data Extractor (MDE) White Paper
Executive Summary
The M Data Extractor (MDE) is an intermediary (middleware) software component that enables robust, high-performance connections between the FileMan database residing in an M system and modern, commercial off-the-shelf (COTS) software tools such as relational database systems, decision support systems, and ad hoc report generation systems.
Introduction
FileMan, a component of VistA, is a nationally standardized database with a dictionary-based architecture. It is built on top of the M operating environment and employs a hierarchical data storage methodology.
FileMan is the result of a government-sponsored database development effort over the past 20 years, and is used principally within the Veterans Health Administration. However, FileMan is public-domain software and, as such, has been adopted by non-government healthcare organizations as well as non-healthcare businesses. FileMan software is also capable of running on every major computer operating system.
The Issue: Large Scale Data Analysis on M-based VA Healthcare Systems
FileMan database systems greatly affect a healthcare facility’s effectiveness through their methods for locating, acquiring, and analyzing information. Many healthcare facilities need to gather and summarize large volumes of data to make informed clinical and business decisions. However, M-based FileMan database systems do not typically facilitate this type of decision-making. While data is available for routine healthcare operations, the same data is relatively inaccessible for in-depth analysis, ad hoc queries, and reports.
The present FileMan reporting capabilities, while powerful, are not easy to learn, and complex queries consume a significant portion of system resources. The resulting operational data is also usually not in the form that users want, since it often contains internal codes, unneeded groupings of data items, and redundant data fields. In addition, locating information in FileMan requires a person highly familiar with the file structure.
Although many healthcare facilities follow FileMan standards for storing data in data dictionaries, many also make minor modifications so that FileMan can meet their unique needs. This flexibility is very helpful to individual facilities, but it can make data analysis difficult, whether within a single location or across several. For example, consider the database definition of a lab test such as potassium. The test may be named differently at different facilities, and a single site can also have multiple forms of the test, as in the following example from the Pittsburgh VA:
| Test | Location of test in File Manager |
| Potassium | Field 6 in File 63 |
| Potassium (BG) | Field 645059 in File 63 |
| Potassium (CX7-OUT) | Field 646074 in File 63 |
The locations of these tests differ from one VA site to another. It is therefore necessary to use a data extraction tool that works with FileMan and outputs the data in a correlated fashion.
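One way to picture such correlation is a crosswalk that maps each site-specific field to a shared test name. The following Python sketch is illustrative only: the `CROSSWALK` table, the function names, and the decision to treat all three Pittsburgh potassium fields as one test are assumptions made for this example, not a description of how the MDE is implemented.

```python
# Hypothetical crosswalk: (site, file number, field number) -> shared test name.
# The field numbers are the Pittsburgh entries listed above; treating the
# three variants as equivalent is an assumption made for illustration.
CROSSWALK = {
    ("Pittsburgh", 63, 6): "Potassium",
    ("Pittsburgh", 63, 645059): "Potassium",
    ("Pittsburgh", 63, 646074): "Potassium",
}


def canonical_test_name(site, file_no, field_no):
    """Return the shared test name for a site-specific field, or None if unmapped."""
    return CROSSWALK.get((site, file_no, field_no))


def merge_results(rows):
    """Group raw (site, file, field, value) rows under their shared test name."""
    merged = {}
    for site, file_no, field_no, value in rows:
        name = canonical_test_name(site, file_no, field_no)
        if name is None:
            continue  # unmapped fields are skipped in this sketch
        merged.setdefault(name, []).append(value)
    return merged
```

Supporting an additional site would then mean adding that site's field numbers to the crosswalk, while queries continue to use the shared name.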
One approach to improving the accessibility of data stored in M databases is to use ODBC mapping directly from within the M system. However, the direct ODBC approach does not provide an efficient solution for large-scale data analysis for the following reasons:
- The data remains in its hierarchical form, making performance slow
- There is no aggregation or cleansing of data, two tasks that greatly improve the ease and speed of data analysis
- There is no segmentation of domain-specific data elements, making analysis more complex
There is great potential for clinical and administrative insight in unlocking the data stored in FileMan, the basis of the Veterans Health Care delivery system. There are 172 VA medical centers and outpatient clinics spanning the nation. Since FileMan is standard throughout the entire system, the data contained in the VA database is perhaps the largest unified collection of administrative and clinical data anywhere.
The Solution: the M Data Extractor (MDE)
The M Data Extractor (MDE), combined with readily available relational software tools, provides a solution to these M data analysis difficulties. The MDE is an intermediary (middleware) software component between the FileMan database residing in an M system and modern, commercial off-the-shelf (COTS) software tools, such as relational database systems, decision support systems, and ad hoc report generation systems.
In other words, the MDE migrates data from FileMan's hierarchical format to a modern, high-performance relational format, transforming the hierarchical data relationships within FileMan into equivalent relational linkages. Once this transformation is performed, third-party tools can fully analyze the migrated data.
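As a rough illustration of this kind of transformation, the Python sketch below splits one nested record into a parent table and a child table linked by a key, using the standard-library `sqlite3` module as a stand-in relational target. The record layout, table names, and column names are hypothetical and are not taken from FileMan or the MDE.

```python
import sqlite3

# Hypothetical hierarchical record: a patient entry with a nested list of
# admissions. All names and values here are illustrative only.
patient = {
    "ien": 1,
    "name": "PATIENT,TEST",
    "admissions": [
        {"ien": 1, "admit_date": "1997-03-15", "ward": "3 NORTH"},
        {"ien": 2, "admit_date": "1997-06-02", "ward": "ICU"},
    ],
}


def flatten(record):
    """Split one nested record into a parent row and a list of child rows.

    Each child row carries the parent's key, so the nesting becomes a
    foreign-key relationship.
    """
    parent_row = (record["ien"], record["name"])
    child_rows = [
        (record["ien"], adm["ien"], adm["admit_date"], adm["ward"])
        for adm in record["admissions"]
    ]
    return parent_row, child_rows


def load(conn, record):
    """Create the two tables if needed and insert the flattened rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patient (ien INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS admission ("
        " patient_ien INTEGER REFERENCES patient(ien),"
        " ien INTEGER, admit_date TEXT, ward TEXT,"
        " PRIMARY KEY (patient_ien, ien))"
    )
    parent_row, child_rows = flatten(record)
    conn.execute("INSERT INTO patient VALUES (?, ?)", parent_row)
    conn.executemany("INSERT INTO admission VALUES (?, ?, ?, ?)", child_rows)


conn = sqlite3.connect(":memory:")
load(conn, patient)

# The former nesting is now recovered with an ordinary join.
rows = conn.execute(
    "SELECT p.name, a.admit_date, a.ward FROM patient p"
    " JOIN admission a ON a.patient_ien = p.ien ORDER BY a.ien"
).fetchall()
```

Once the data is in this form, any SQL-capable reporting tool can query it without knowledge of the original hierarchy.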
Features and Functions
The MDE includes the following features and functions:
- Access to right-out-of-the-box FileMan files, such as:
  - Predefined (demographics, medications, admission/discharge)
  - Customized (group your own files)
- Extraction of data from M FileMan databases into a relational format that does the following:
  - Creates tables and keys
  - Flattens hierarchically defined data structures
  - Removes transitive data dependencies
- Control of the amount and type of File Manager data to extract, either interactively or via a predefined data file. For example, a researcher could create a list of the CPK lab values from 50 specific patients on his or her desktop PC
- Conversion of FileMan data types to standard SQL data types
- Ability to extract data from FileMan and load it into SQL in bulk
- Ability to perform small ad hoc data transfers from FileMan to SQL using network (TCP/IP) or serial connections
- Ability to clean and filter data as it is transferred from File Manager to SQL
- Merger of data listed under separate, but essentially the same, names, such as lab tests
- Evaluation of arcane data pointers into values suitable for relational storage
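To make the data type conversion and pointer evaluation items more concrete, the Python sketch below converts three kinds of FileMan internal values. FileMan stores dates internally as YYYMMDD.HHMMSS, where YYY is the year minus 1700. Everything else here is an assumption made for illustration: the `SQL_TYPES` mapping, the code and ward tables, and the function names are hypothetical and do not describe the MDE's actual implementation. The date helper also assumes a fully specified date (no zero month or day).

```python
from datetime import datetime

# Assumed mapping from FileMan field types to SQL column types (illustrative).
SQL_TYPES = {
    "FREE TEXT": "VARCHAR",
    "NUMERIC": "NUMERIC",
    "DATE/TIME": "TIMESTAMP",
    "SET OF CODES": "VARCHAR",
    "POINTER": "INTEGER",
}

# Hypothetical lookup tables standing in for a set-of-codes definition
# and for the file that a pointer field refers to.
SEX_CODES = {"M": "MALE", "F": "FEMALE"}
WARDS = {12: "3 NORTH", 15: "ICU"}


def fileman_date_to_iso(internal):
    """Convert a FileMan internal date (YYYMMDD[.HHMMSS]) to an ISO 8601 string."""
    date_part, _, time_part = str(internal).partition(".")
    year = int(date_part[:3]) + 1700  # FileMan years are offset by 1700
    month = int(date_part[3:5])
    day = int(date_part[5:7])
    time_part = time_part.ljust(6, "0")  # trailing zeros are dropped in storage
    hour, minute, second = (
        int(time_part[:2]),
        int(time_part[2:4]),
        int(time_part[4:6]),
    )
    return datetime(year, month, day, hour, minute, second).isoformat()


def decode_set_of_codes(code, codes):
    """Replace an internal code with its external value; pass unknown codes through."""
    return codes.get(code, code)


def resolve_pointer(ien, target_file):
    """Replace a pointer (an internal entry number) with the pointed-to entry's name."""
    return target_file.get(ien)
```

For example, `fileman_date_to_iso("2970315.1430")` yields `1997-03-15T14:30:00`, a value that can be loaded directly into a SQL timestamp column.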