Click Here to Login

NPPESData.com

Clean, easy, reliable NPI Data

The below present the various format and structure options that NPPESData.com provides.

Other Information

Flat Structure

All of our flat structures are simple to implement and easy to load. Flat formats are best for those who need a quick load of the NPPES data for prototyping or other similar purposes.

Pros: Quick and easy
Cons: Software development against a flat structure can be more complex and difficult to maintain.
Updating: Use the DML provided by us, replace the whole file each month or write your own software to process delta files (CSV or Pipe-Delimited).

Relational Structure

Relational structures are more difficult to implement (a larger number of database tables and files to load), but provide more flexibility down the line. They are easier to develop software against, and are the easiest to use with our update service.

To create our relational structures, we take all low-cardinality values (those that have a small number of common values) and put them into separate tables with unique numeric identifiers. We also take multi-instance data (e.g. provider credentials and provider taxonomy) and put them in separate tables related by the NPI (the unique provider identifier) for a given provider, including a sequence of 1st, 2nd,3rd, etc. if more than one exists.

Pros: More flexible, faster to update, easier to develop software.
Cons: Initial implementation more complex.
Updating: Run incremental ANSI-SQL scripts against your database or write your own software to process delta files (CSV or Pipe-Delimited).

CSV – Comma Separated Values

This is a simple, standard format that can be read natively by many applications (such as Microsoft Excel). However, the NPPES data set is so large that you may have problems doing so.

The format itself provides optionally quoted values delimited by commas with each line (row) being delimited by a newline character. We ensure that commas within values are properly quoted and ensure that double-quotes are not included in any data.

Example:

ID,Last_Name,First_Name,Occupation
123,”Doe”,”Jane”,”Certified Public Accountant”
124,”Smith”,”John”,”Programmer Analyst”

Pros: Standard, widely-used format.
Cons: File too large for MS Excel and other apps
Updating: We provide added,updated and removed CSV files that you can process.

Pipe-Delimited

Pipe-Delimited is similar to CSV except that data field values are delimited by a pipe “|” character. This has the advantage of being less ambiguous since pipe characters are unlikely to occur in user-entered data (we convert any pipe characters that do exist to forward slash characters “/”).

Example:

ID|Last_Name|First_Name|Occupation
123|Doe|Jane|Certified Public Accountant
124|Smith|John|Programmer Analyst

Pros: Generally Easier to process than CSV (using either your own custom software or database loading programs).
Cons: Not a widely-used standard.
Updating: We provide added, updated and removed files that you can process.

MySQL Table

A MySQL Table is probably the fastest and easiest way to “load” data since MySQL is a free, file-based database. All MySQL Table options specify the MyISAM database engine. To load, simply shut down MySQL and copy the appropriate database files into the correct data folders and then re-start MySQL.

If you need to use InnoDB or other database engines with MySQL, you can use the ANSI DML option - or use the MySQL MyISAM tables to stage data and then insert into an InnoDB table with a few simple scripts.

You can also use MySQL as a staging database and then transfer the data to an Oracle, Sybase, SQL Server or other database using many commercially available ETL (Extract, Transform and Load) tools.

Pros: Fast, free database software that is easy to administer; Can be used as a staging database to load into other commercial databases.
Cons: You might not be using MySQL.
Updating: 1) Replace the entire database with the current full dataset
2) Run ANSI SQL updates against the database

ANSI SQL - DML

DML stands for “Data Manipulation Language” and is part of the ANSI SQL (Structured Query Language) specification. Basically, the DML format provides a set of files that contain insert statements into a table or set of tables.

The DML we provide is compatible with Oracle, SQL Server, DB2, MySQL and other database systems that are ANSI compliant.

Pros: Highly standardized means of loading data into a database.
Cons: Much slower to load; Much larger files.
Updating: We provide a set of SQL statements that insert new records and process updates/deletes.