Occasionally you need to import data from a fixed format file. "BULK INSERT" provides an easy way to do that.
Suppose you have a three line column-formatted file of pet information as shown below
(the first two rows are for column description and do not appear in the file):
0 1 2 3 4 5
12345678901234567890123456789012345678901234567890
CC 1Dog Rover United States
CC 82Turtle Yurtle Germany
CC 5Llama Rama Peru
And you want to put them in this table
CREATE TABLE MYPETS
(
[Age] int NOT NULL,
[Animal] CHAR(10) NOT NULL,
[Name] VARCHAR(30) NOT NULL
)
GO
You first need to create an xml description of the data.
In the "RECORD" you specify which columns hold which data fields. The "ROW" will describe how to stuff the fields into the table.
A few notes:
You don't have to import all the fields. You can skip them by not putting them in the "ROW" section.
You do need to describe the entire row. It's a little tricky for the length of the RECORD because of "end-of-line" characters. Are there 1 or 2 terminating characters?
If you get errors like this:
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 4 (Name).
You may need to experiment with the length of the final column.
When parsing integers I found that its easiest to convert all the "INT" columns to "CHAR" and look at the output. Often its off by one or two characters. Then convert back to INT.
Here's our xml format file:
<BCPFORMAT
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="2"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="3"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="8"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="20"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="20"/>
</RECORD>
<ROW>
<COLUMN SOURCE="2" NAME="AGE" xsi:type="SQLINT"/>
<COLUMN SOURCE="3" NAME="ANIMAL" xsi:type="SQLCHAR"/>
<COLUMN SOURCE="4" NAME="NAME" xsi:type="SQLCHAR"/>
</ROW>
</BCPFORMAT>
Finally to put it all together, use the "BULK INSERT" command to read the document and use your xml format file:
BULK INSERT MYPETS FROM 'c:\Users\BDCOOPER\Documents\mypets.txt'
WITH
(
FIRSTROW = 1, -- skip to this row increase of header rows
--LASTROW = 12, --optional used for debugging
ROWTERMINATOR = '\n',
--ERRORFILE = 'c:\Users\BDCOOPER\Documents\MyPets.error',
FORMATFILE = 'c:\Users\BDCOOPER\Documents\MyPets.xml'
)
GO
select top 10 * from MyPets
The "LASTROW" is useful for debugging. If you process dies in the middle of your file, set LASTROW to 2 and see what is being imported into your database. Usually it's just off by a column or two.
ERRORFILE can be useful for viewing errors, but its also a pain because enterprise manager can retain a lock on it.
If all goes well you will get your text data imported into SQL.
Age Animal Name
1 Dog Rover
82 Turtle Yurtle
5 Llama Rama
You can also use BULK INSERT to read comma separated files by defining a field separator.
Enjoy.
No comments:
Post a Comment