NPPES File Details
| As of: | January, 2012 |
| Record Count: | 3.50 Million Records |
| File Size: | 4.2 GB (419 MB Compressed) |
| # Columns (Flat Structure): | 329 |
| Available Formats (initial): | CSV, Pipe-Delimited, MySQL Table, ANSI SQL Inserts |
| Available Formats (updates): | CSV, Pipe-Delimited, ANSI SQL DML |
Changes We Make
We make several changes to the structure of the raw data as provided by CMS. Some of the key changes are noted below.
- Column names are abbreviated
For example, "Provider Business Mailing Address Country Code" as provided in the NPPES data is abbreviated in all of our formats to "bus_mail_country_code".
Abbreviations make the data much easier to work with while retaining readability.
- Column names are lower case
In our DDL and DML scripts, we force all column names to be lower case (as opposed to the mixed case provided by NPPES).
- All dates are converted to YYYY-MM-DD format for simplicity.
If the native date format of your RDBMS is different, you may be able to temporarily change it for our loads. If not, we can provide sample scripts (in Perl) that allow you to quickly pre-process the data we provide to reformat dates.
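As a rough illustration of that kind of pre-processing, here is a minimal Python sketch (not one of our Perl sample scripts). It assumes the CSV format with a header row; the column name `enumeration_date` and the target format MM/DD/YYYY are hypothetical and chosen only for the example.

```python
import csv
import io
from datetime import datetime


def reformat_dates(src, dst, date_columns, target_format="%m/%d/%Y"):
    """Copy CSV rows from src to dst, rewriting YYYY-MM-DD values
    in the named columns to target_format."""
    reader = csv.DictReader(src)
    # Keep the single-newline line ending used in the delivered files.
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in date_columns:
            value = row.get(col)
            if value:  # empty (null) dates are left untouched
                parsed = datetime.strptime(value, "%Y-%m-%d")
                row[col] = parsed.strftime(target_format)
        writer.writerow(row)


# Usage with in-memory data; "enumeration_date" is a hypothetical column name.
sample = io.StringIO("npi,enumeration_date\n1234567890,2012-01-15\n1234567891,\n")
out = io.StringIO()
reformat_dates(sample, out, ["enumeration_date"])
print(out.getvalue())
```

For the real files you would pass open file handles instead of `io.StringIO` objects and list every date column you need converted.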
- Line endings are forced to a single newline character.
We do this because a single newline (LF) still works on Windows-based systems, whereas a carriage return/newline pair (the Windows line ending) causes difficulties on Unix-based systems (and others).
- Additional Columns
We provide additional columns in the data to indicate that a record has been removed. If you run our update DML against an installed database, it will not delete rows but will instead mark them as removed. This gives you the flexibility to maintain provider history if you wish (in case other data in your database is keyed to NPPES data), or to simply delete the marked rows once you've run the updates.
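The two options can be sketched as follows, using Python's built-in `sqlite3` module so the example is self-contained. The table name `provider`, the column name `removed`, and the marker value `'Y'` are assumptions made for illustration only; check the delivered DDL for the actual names and values.

```python
import sqlite3

# In-memory database standing in for an installed NPPES database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE provider (npi TEXT PRIMARY KEY, org_name TEXT, removed TEXT)"
)
conn.executemany(
    "INSERT INTO provider VALUES (?, ?, ?)",
    [
        ("1000000001", "Clinic A", None),  # active record
        ("1000000002", "Clinic B", "Y"),   # marked as removed by an update
    ],
)

# Option 1: keep history and filter out removed rows when querying.
active = conn.execute(
    "SELECT npi FROM provider WHERE removed IS NULL ORDER BY npi"
).fetchall()

# Option 2: purge the marked rows once the updates have been applied.
deleted = conn.execute("DELETE FROM provider WHERE removed = 'Y'").rowcount
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM provider").fetchone()[0]
print(active, deleted, remaining)
```

Keeping the marked rows is the safer default if other tables reference providers by NPI, since purging would leave those references dangling.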
- Code Tables
We have converted all integer-keyed code tables, and their corresponding columns in the data tables, to varchar. This is because CMS has columns with both integer and varchar data types keyed to the same code set.
We have added the value 'X' = 'Unknown Value' to each of the CMS code sets, for data that falls outside of CMS's specification. Null values provided in the data remain null.
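The rule described above can be summarized in a few lines of Python. This is only a sketch of the behavior, not our actual conversion code; the function name `normalize_code` and the sample code set are hypothetical.

```python
def normalize_code(value, valid_codes):
    """Return the code as a string, 'X' if it is outside the code set,
    or None if the source value was null."""
    if value is None:
        return None  # nulls in the source data remain null
    code = str(value)  # integer-keyed codes become varchar-style strings
    return code if code in valid_codes else "X"


# Hypothetical code set, stored as strings after the varchar conversion.
sample_codes = {"1", "2"}

print(normalize_code(1, sample_codes))     # integer key matched as a string
print(normalize_code("9", sample_codes))   # outside the code set
print(normalize_code(None, sample_codes))  # null stays null
```

Because 'X' exists as a row in every code table, a join from a data column to its code table still resolves for out-of-specification values.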
- Relational Data Model
The diagram below displays the tables created in the relational model.