August 1998

DB->dilemmas

by Jim Bailey

Sample files illustrating this article's examples are available at www.cobb.com/cpb. Click the Source Code link, then download aug98.zip. Unzip the archive, and you'll find these files in db.zip.

 

Text tables 2

Your article on text tables alone was worth the annual subscription ["DB-> Dilemmas," April 1998]. I'm considering using the TBatchMove object to populate my tables. I'd like to see a related article.

Fred C. Mitchell
Princeton, NJ

Thanks, Fred. I received several responses like yours. Importing text files is a definitely a trouble spot, primarily because the documentation is so weak. I've recently worked with the TBatchMove object and found it to be a superb tool. In this column, I'll show you how to build a utility to import text files using the TBatchMove component. Along the way, I'll discuss an easier way to write schema files.

My motivation

Recently, I was retained for a project that involved porting data from an old PC-based Xenix system to an AS 400. Retrieving data from the old system was apparently impossible. It was a DB->Dilemma I'd never seen before. Fortunately, I eventually stumbled onto the Xenix print queue and managed to export the files to reports. The reports weren't organized in predictable rows and columns, but in complex, irregular sections. Even so, I finally managed to parse the report files into text tables and began the process of exporting text files into Paradox.

On my first attempt to port the data, I opened the text file as a table and used a TQuery object with a parameterized INSERT query to populate the Paradox table. Much to my surprise, the process took hours--and there were only 10,000 records.

To get the data into a Paradox database, I decided to use the TBatchMove object, hoping it would offer reasonable performance. I was astonished at the results. Initially, I thought the command had failed because the entire text table was copied in eight or nine seconds!

 

Scheming to save time and effort

In the process of porting this data, I got sick of writing schema files. It's definitely a boring, error-prone task. I decided to develop a text-porting tool to enter the structure of the text table and let the porting tool create the schema file and database table. I set up the following simple C structure named fld to define text table fields:

 

struct fld {
  char FieldName[25];
  int FieldLen;
  char type;  //Field type
  };

Completing this structure is much simpler than working with schema files. For instance, if the text table's structure changes, every line in a schema file will probably be affected. Using the declaration shown in Listing A, you need to know only the width of each field. If it changes, you just update the width of that one field. Figure A shows a listing of open invoices; the data structure in Listing A defines every field in Figure A.

Listing A: The declaration to set up a field structure

#define CHAR_TYPE      'c' //code value
#define DATE_TYPE      'd'
#define SHORT_TYPE     'i' //16 bit integer
#define CURRENCY_TYPE  'm'
#define PAD_TYPE       'p'
#define LAST_INSTRUX   'z'

fld f[]=
{ //fName       len     Type
  {"p1",         3,     PAD_TYPE},
  {"InvoiceNo",  6,     CHAR_TYPE},
  {"p2",         1,     PAD_TYPE},
  {"BranchID",   2,     SHORT_TYPE},
  {"p3",         1,     PAD_TYPE},
  {"Description",15,    CHAR_TYPE},
  {"SoldBy",     2,     SHORT_TYPE},
  {"InvoiceCode",1,     CHAR_TYPE},
  {"InvoiceDate",12,    DATE_TYPE},
  {"Reference",  7,     CHAR_TYPE},
  {"InvoiceAmt", 8,     CURRENCY_TYPE},
  {"eof",        0,     LAST_INSTRUX}
};
String TextTableName("Aging");
String DBTableName(TextTableName);
Figure A:
R02502 01 INVOICE 1 R 10/09/97 L03453 2418.75

R02502 01 W/O INSURANCE 1 R 10/13/97 W/O INS 180.98-

R02502 01 LOCKBOX BRANCH 1 R 11/05/97 CK15246 2413.12-

R02503 01 INVOICE 1 R 10/09/97 L03454 1666.25

This text file lists invoices for an aging statement.

In Listing A, each text table field is defined with three columns: the field name, the field width, and the field type. These parameters apply only to the text table. You just count the number of characters in each field. If you don't want to keep the field, mark it as a PAD_TYPE; otherwise, select the appropriate type, and you're home free.

What is PAD_TYPE? Remember, Figure A is a printed report file--the first three characters on every line are spaces. Creating a PAD_TYPE field allows the text-porting tool to skip over certain data in fixed-field-width text tables. Compare Listing A to Figure A to see how to use the pad fields to strip empty columns effectively.

 

About the application

The text-porting tool creates the schema and table based on the information in the fld structure shown in Listing A. Then, the porting tool can populate the table it creates with data from the text file. If you do much work with text tables, you'll want to download the sample text table tool application from www.cobb.com/cpb. The application relies on four key functions to do most of the work. Those functions are as follows:

 

void BuildSchema(fld f[], 
  String &TextTableName, 
  const char SCHEMA_TYPE, bool Debug);

void BuildTable(fld f[], String &TableName, 
  bool Debug);

void CopyTextToTable(String &TextTableName, 
  String &DBTableName);

void ViewTable(String NewFile);
Using only the information in the fld structure, these functions create a schema, set up a database table, copy the text table to the database table, and link the data to a TDBGrid for immediate viewing. Using the fld structure is remarkably fast and much easier to maintain than working directly on the schema files and SQL commands. Figure B shows the control panel for the text-porting tool.

Figure B: Select your options from the text-porting application main screen.
[ Figure B ]

The application consists of a radio group, a rich edit text box, a grid, three check boxes, and a command button. The grid and rich edit text box occupy the same space on the screen.

The radio group object labeled Command specifies what action is taken after you click the Execute Command button. In this application, the button seems a bit of overkill; but in reality, I put several buttons on the screen, each processing its own text table.

The three check boxes are extremely helpful. When Debug is checked, the commands don't create a schema file or a table. However, the commands (which would be written to a file if Debug wasn't checked) are visible in the rich edit component.

The All Char check box forces every field to type CHAR (text). This process is often helpful when debugging a data port. The View Grid component shows or hides the grid, based on the value of the check box. I didn't include much code in this article because of space limitations, but the complete source is available in the sample application.

 

Logic anomalies

I ran across some interesting behavior while developing the BuildSchema and BuildTable functions: The schema field types are mentally incompatible (at least in my mental resources). For example, in a schema file, field type NUMBER is a short integer, while SQL typically defines NUMBER as a double value. I set up a system to cross-reference field types between the schema and SQL, as shown in Table A.

Table A: Field-type specification cross-references

Flg

Macro name

Schema

SQL

'c'

CHAR_TYPE

CHAR

CHAR

'd'

DATE_TYPE

DBIDATE

DATE

't'

TIME_TYPE

TIME

TIME

'D'

DATETIME_TYPE

TIMESTAMP

TIMESTAMP

'm'

CURRENCY_TYPE

FLOAT

MONEY

'f'

FLOAT_TYPE

FLOAT

FLOAT

'i'

SHORT_TYPE

NUMBER

SMALLINT

'l'

LONG_TYPE

LONGINT

INTEGER

'b'

BOOL_TYPE

BOOL

BOOLEAN

I tested the SQL types using Paradox and Access 97. All the types are supported in both environments except BOOLEAN. The MS Access Boolean specifier is LOGICAL and doesn't accept BOOLEAN as a field type in a CREATE TABLE statement.

Using a table similar to the one shown in Table A adds considerable flexibility to your application. Modifications to the program's macros accommodate requirements across multiple databases. Simply changing BOOLEAN to LOGICAL in the source code allows you to create an Access table instead of Paradox.

Double mapping also provides flexibility. For example, the schema specification doesn't support a currency field type. I wanted to create a database table using currency fields. So, I simply mapped the schema's FLOAT type twice--once as FLOAT and once as MONEY. When creating the schema file, the program uses FLOAT for both. However, the SQL command uses either FLOAT or MONEY, as specified.

TBatchMove

The TBatchMove component is simple to use. It requires two tables--a text table and a database table. You should set up the TDatabase and/or any aliases before trying to execute this code. (Since we've already covered these topics in detail in previous issues, we won't repeat them now.) Once you've set up all the databases, the code shown in Listing B executes the copy from the text table to the database table.

Listing B: Code required for porting data from a text table to a database table

  String dbTable(DBTableName);
  dbTable +=  DB_FILE_EXT;
  tblPdox->TableName = dbTable;
  tblTxt->TableName = TextTableName;
  dbTxt->Connected = true;
  dbPdox->Connected = true;
  tblTxt->Active = true;
  tblPdox->Active = true;
  BatchMove1->Execute();
  tblPdox->Active = false;
  tblTxt->Active = false;
  dbTxt->Connected = false;
  dbPdox->Connected = false;

The objects with tbl name prefixes are TTable objects, and the db prefixed objects are TDatabase objects. Honestly, there isn't much more we can say about using the TBatchMove object other than it works. It works fast--it makes my clients happy--and it makes me happy.

 

Conclusion

I've seen many questions concerning performance while appending new records using C++Builder. Indeed, at times I've experienced terrible performance myself. However, the text table-porting tool I've demonstrated here (including the TmoveBatch object), provides astonishing performance and is also easy to use. Keep in mind that the text-porting tool isn't designed for end users, but it has been a welcome edition to this developer's tool kit.