Monday, July 25, 2016

How To Import Text Files Into Sql Server Tables

Occasionally you need to import data from a fixed format file.  "BULK INSERT" provides an easy way to do that.

Suppose you have a three line column-formatted file of pet information as shown below
(the first two rows are for column description and do not appear in the file):


0        1         2         3         4         5
12345678901234567890123456789012345678901234567890
CC  1Dog     Rover                United States   
CC 82Turtle  Yurtle               Germany         
CC  5Llama   Rama                 Peru          
  



And you want to put them in this table


CREATE TABLE MYPETS
(
 [Age]     int   NOT NULL,
 [Animal] CHAR(10)   NOT NULL,
 [Name] VARCHAR(30) NOT NULL
)
GO


You first need to create an xml description of the data. 

In the "RECORD" you specify which columns hold which data fields.  The "ROW" will describe how to stuff the fields into the table.

A few notes:

You don't have to import all the fields.  You can skip them by not putting them in the "ROW" section.

You do need to describe the entire row.  It's a little tricky for the length of the RECORD because of "end-of-line" characters.  Are there 1 or 2 terminating characters?
If you get errors like this:



Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 4 (Name).


You may need to experiment with the length of the final column.
When parsing integers I found that its easiest to convert all the "INT" columns to "CHAR" and look at the output.  Often its off by one or two characters.  Then convert back to INT.

Here's our xml format file:


<BCPFORMAT  
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"  
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
  <RECORD> 
    <FIELD ID="1" xsi:type="CharFixed" LENGTH="2"/>  
    <FIELD ID="2" xsi:type="CharFixed" LENGTH="3"/>  
    <FIELD ID="3" xsi:type="CharFixed" LENGTH="8"/>  
    <FIELD ID="4" xsi:type="CharFixed" LENGTH="20"/>  
    <FIELD ID="5" xsi:type="CharFixed" LENGTH="20"/>  
  </RECORD> 
  <ROW> 
    <COLUMN SOURCE="2" NAME="AGE" xsi:type="SQLINT"/> 
    <COLUMN SOURCE="3" NAME="ANIMAL" xsi:type="SQLCHAR"/> 
    <COLUMN SOURCE="4" NAME="NAME" xsi:type="SQLCHAR"/> 
  </ROW> 
</BCPFORMAT> 



Finally to put it all together, use the "BULK INSERT" command to read the document and use your xml format file:



BULK INSERT MYPETS FROM 'c:\Users\BDCOOPER\Documents\mypets.txt'
WITH
(
FIRSTROW = 1, -- skip to this row increase of header rows
--LASTROW = 12, --optional used for debugging
ROWTERMINATOR = '\n',
--ERRORFILE = 'c:\Users\
BDCOOPER\Documents\MyPets.error',
FORMATFILE = 'c:\Users\
BDCOOPER\Documents\MyPets.xml'
)
GO
select top 10 * from MyPets

The "LASTROW" is useful for debugging.  If you process dies in the middle of your file, set LASTROW to 2 and see what is being imported into your database.  Usually it's just off by a column or two.

ERRORFILE can be useful for viewing errors, but its also a pain because enterprise manager can retain a lock on it.

If all goes well you will get your text data imported into SQL.



Age   Animal       Name
 1    Dog           Rover              
82    Turtle        Yurtle             
 5    Llama         Rama                



You can also use BULK INSERT to read comma separated files by defining a field separator.

Enjoy.

No comments: