February 1998

Replacing variables in HTML documents

by Mark G. Wiseman

I recently helped construct a new Web site for my company. Early in the project, we realized we could reduce the time spent maintaining our Web site by automating some of the tasks. We noticed that, over time, the structure of many Web pages remained the same--only portions of the text and some of the graphics had to be changed. What we needed was a program that would search through an HTML file and replace variables with updated values. In this article, we'll create a Replace.exe program, shown in Figure A, to do just that.

Figure A: Our Replace.exe program looks like this in action.

Inputs and outputs

Listing A contains a simple HTML input file with the following four variables:


* <!--VAR:Site-->
* <!--VAR:Link-->
* <!--VAR:Saying-->
* <!--VAR:Today-->

We decided to wrap our variable names with the HTML comment delimiters <!-- and -->. By making the variable names HTML comments, we can use HTML editors on our files. We also decided to begin each variable name with VAR: to make the variables easier to distinguish from actual HTML comments. Replace.exe doesn't require either of these conventions, though, so you can name your variables differently if you desire.

Listing A: HTML source file

<html>
<head>
<title>Really Great Links</title>
</head>

<body bgcolor="#FFFFFF">

<h1>Really Great Links</h1>

<hr>

<h3>The following links will take you to 
  <!--VAR:Site--> and other great sites.</h3>

<p><!--VAR:Link--></p>

<p><a href="http://www.borland.com">Borland</a></p>

<hr>

<table border="0" width="100%">
   <tr>
      <td valign="bottom">Saying for the day: 
        <!--VAR:Saying--></td>
      <td align="right" valign="bottom"><h5>
        Last updated: <!--VAR:Today--></h5></td>
   </tr>
</table>

<p align="left">&nbsp;</p>
</body>
</html>

Replace.exe replaces the variable names using the values in the variables file shown in Listing B, then produces the HTML output file shown in Listing C. Note that values don't have to be plain text: they can be links to other pages (as with <!--VAR:Link-->), links to graphics files, or even HTML formatting codes.

Listing B: Variables file

<!--VAR:Site-->=The Cobb Group
<!--VAR:Link-->=<a href="http://www.cobb.com/">The Cobb Group</a>
<!--VAR:Saying-->=A stitch in time saves nine.
<!--VAR:Today-->=February 1, 1998

Listing C: HTML output file

<html>
<head>
<title>Really Great Links</title>
</head>

<body bgcolor="#FFFFFF">

<h1>Really Great Links</h1>

<hr>

<h3>The following links will take you to 
  The Cobb Group and other great sites.</h3>

<p><a href="http://www.cobb.com/">
  The Cobb Group</a></p>

<p><a href="http://www.borland.com">Borland</a></p>

<hr>

<table border="0" width="100%">
   <tr>
      <td valign="bottom">Saying for the day: 
        A stitch in time saves nine.</td>
      <td align="right" valign="bottom"><h5>
        Last updated: February 1, 1998</h5></td>
   </tr>
</table>

<p align="left">&nbsp;</p>
</body>
</html>

This example really shows off the power of C++Builder. By using two classes, AnsiString and TStringList, you can--in only a dozen or so lines of code--replace all the variables in an HTML file.

Tip: Tying strings together

If you aren't quite sure how to use those AnsiStrings, see the article "An AnsiString Class Reference" in the August 1997 issue of C++Builder Developer's Journal.

Here's how

Listings D and E contain all the code you need. One method of the TMainForm class, ProcessClick(), does nearly everything by itself.

Listing D: Replace header

#ifndef MainH
#define MainH

#include <vcl\Classes.hpp>
#include <vcl\Controls.hpp>
#include <vcl\StdCtrls.hpp>
#include <vcl\Forms.hpp>
#include <vcl\ComCtrls.hpp>
#include <vcl\ExtCtrls.hpp>
#include <vcl\Dialogs.hpp>

class TMainForm : public TForm
   {
   __published:
      TEdit *InputEdit;
      TEdit *OutputEdit;
      TEdit *VariablesEdit;
      TButton *InputButton;
      TLabel *Label;
      TButton *OutputButton;
      TButton *VariablesButton;
      TButton *ProcessButton;
      TButton *CloseButton;
      TProgressBar *ProgressBar;
      TBevel *Bevel;
      TOpenDialog *InputFileDialog;
      TSaveDialog *OutputFileDialog;
      TOpenDialog *VariablesFileDialog;

      void __fastcall CloseClick(TObject *Sender);
      void __fastcall ProcessClick(TObject *Sender);
      void __fastcall InputButtonClick(
        TObject *Sender);
      void __fastcall OutputButtonClick(
        TObject *Sender);
      void __fastcall VariablesButtonClick(
        TObject *Sender);

   public:
      __fastcall TMainForm(TComponent* Owner);
   };

extern TMainForm *MainForm;

#endif

Listing E: Replace source

#include <vcl\vcl.h>
#pragma hdrstop

#include "Main.h"

#include <memory>

using namespace std;

#pragma resource "*.dfm"
TMainForm *MainForm;

__fastcall TMainForm::TMainForm(TComponent* Owner) : 
  TForm(Owner)
   {
   Caption = Application->Title;
   }

void __fastcall TMainForm::CloseClick(
  TObject *Sender)
   {
   Close();
   }

void __fastcall TMainForm::ProcessClick(
  TObject *Sender)
   {
   auto_ptr<TStringList> buffer(new TStringList);
   buffer->LoadFromFile(InputEdit->Text);
   String temp = buffer->Text;

   auto_ptr<TStringList> variables(new TStringList);
   variables->LoadFromFile(VariablesEdit->Text);

   for (int i = 0; i < variables->Count; i++)
      {
      String var = variables->Names[i];
      String val = variables->Values[var];

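      int len = var.Length();
      int start = 1;  // where the next search begins
      int loc = temp.Pos(var);

      while (loc)
         {
         // loc is relative to start; convert it to an
         // index into temp.
         int at = start + loc - 1;
         temp.Delete(at, len);
         temp.Insert(val, at);

         // Search only the text after the inserted
         // value, so a value that contains its own
         // name can't loop forever.
         start = at + val.Length();
         loc = temp.SubString(start,
           temp.Length() - start + 1).Pos(var);
         }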

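      // Dividing by Count (never zero inside the loop)
      // avoids a divide-by-zero when the variables
      // file holds a single pair.
      ProgressBar->Position = (TProgressRange)((
        100 * (i + 1)) / variables->Count);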
      }

   buffer->Text = temp;
   buffer->SaveToFile(OutputEdit->Text);

   Application->MessageBox("Process Complete", 
     Application->Title.c_str(), MB_OK);
   ProgressBar->Position = 0;
   }

void __fastcall TMainForm::InputButtonClick(
  TObject *Sender)
   {
   if (InputFileDialog->Execute()) InputEdit->Text = 
     InputFileDialog->FileName;
   }

void __fastcall TMainForm::OutputButtonClick(
  TObject *Sender)
   {
   if (OutputFileDialog->Execute()) OutputEdit->
     Text = OutputFileDialog->FileName;
   }

void __fastcall TMainForm::VariablesButtonClick(TObject *Sender)
   {
   if (VariablesFileDialog->Execute()) 
     VariablesEdit->Text = 
     VariablesFileDialog->FileName;
   }

First you create buffer, a TStringList, and use it to load the input file named in the input edit control. You then copy all the text in buffer into an AnsiString named temp (String is the VCL's alias for AnsiString).

Next, you load the variable file into variables, another TStringList. Each line of text in variables represents a variable name-value pair. The pair must be a single line of text, and the name must be separated from the value by an equal sign, like this:

name=value

You don't have to parse variables, because TStringList will automatically separate the variable name and its value when you use the properties Names and Values. You cycle through each name-value pair in variables using a for loop. Within the for loop, you use a while loop and the AnsiString methods Pos(), Delete(), and Insert() to replace each occurrence of a variable name in temp with its value. Finally, you copy temp back into buffer so you can use the TStringList method SaveToFile() to write your output file to disk.
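
If you'd like to experiment with the same idea outside C++Builder, the steps above can be sketched in standard C++. This is only an illustration, not the code behind Replace.exe: the names Variable, parseVariables(), and replaceVariables() are invented here, and std::string stands in for AnsiString and TStringList. Unlike Values, which looks a value up by name, the sketch simply walks the pairs in file order.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One name=value pair from the variables file (hypothetical type).
typedef std::pair<std::string, std::string> Variable;

// Split each line at its first '=' sign, as the article describes
// for Names and Values. Lines with no name are skipped.
std::vector<Variable> parseVariables(const std::string& text)
{
    std::vector<Variable> result;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line))
    {
        // Tolerate files saved with CR/LF line endings.
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);

        std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0)
            continue;
        result.push_back(
            Variable(line.substr(0, eq), line.substr(eq + 1)));
    }
    return result;
}

// Replace every occurrence of each variable name with its value.
std::string replaceVariables(std::string text,
                             const std::vector<Variable>& vars)
{
    for (std::size_t i = 0; i < vars.size(); ++i)
    {
        const std::string& name = vars[i].first;
        const std::string& value = vars[i].second;

        std::string::size_type pos = text.find(name);
        while (pos != std::string::npos)
        {
            text.replace(pos, name.size(), value);
            // Resume after the inserted value, so a value that
            // contains its own name cannot loop forever.
            pos = text.find(name, pos + value.size());
        }
    }
    return text;
}
```

Calling replaceVariables(html, parseVariables(variablesText)) on the contents of the two files would give you the text to write to the output file.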

To take advantage of the Windows dialog boxes for opening and saving files, you add three very short methods: InputButtonClick(), OutputButtonClick(), and VariablesButtonClick(). Well, their names may not be short, but the code in each method is only one statement. These methods are assigned to the OnClick events of the buttons next to the input, output, and variables file edit controls. To be really clever, you can also assign these methods to the respective OnDblClick events for the edit controls and their labels.

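That's it. Amazing! C++Builder and the VCL do everything else for you. The VCL even takes care of error handling: if LoadFromFile() or SaveToFile() fails, it raises an exception that ends ProcessClick() early, and the application's default exception handler reports the problem in a message box. Try entering a blank or bad filename and see what happens.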
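
If you ever want to report such failures yourself rather than rely on the default handler, the general pattern might look like the following sketch. It uses standard C++ streams and exceptions, not the VCL, and the function names loadTextFile() and tryLoadTextFile() are made up for this example.

```cpp
#include <exception>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Read a whole text file, throwing if it cannot be opened.
std::string loadTextFile(const std::string& fileName)
{
    std::ifstream in(fileName.c_str(),
                     std::ios::in | std::ios::binary);
    if (!in)
        throw std::runtime_error("Cannot open file: " + fileName);

    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// Turn a failure into a message the caller can display, instead
// of letting the exception travel any further.
bool tryLoadTextFile(const std::string& fileName,
                     std::string& text, std::string& error)
{
    try
    {
        text = loadTextFile(fileName);
        return true;
    }
    catch (const std::exception& e)
    {
        error = e.what();
        return false;
    }
}
```

A caller would check the return value and, on failure, show the error string to the user before giving up on the replacement.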

After replacement

While looking at the code for Replace.exe, you might think that this program will work on any text file, not just HTML. Well, you're right.

Without any changes, the code in this article compiles to a very useful program. However, you might want to modify Replace.exe to accept command-line arguments for the input, output, and variable filenames. Doing so would allow you to process files in batch mode. If you make this modification, you should probably let the user interrupt the replacement process before completion.
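
As a starting point for that modification, here is one way the three filenames might be pulled from the command line. It's a sketch in standard C++ under a few assumptions: the BatchOptions structure, the parseBatchOptions() function, and the argument order (input, output, variables) are all invented for this example. In a VCL application, the same values are available through ParamCount() and ParamStr().

```cpp
#include <string>

// The three filenames Replace.exe needs (hypothetical structure).
struct BatchOptions
{
    std::string input;
    std::string output;
    std::string variables;
};

// Expect exactly: program input output variables
// Returns false if the count is wrong or a name is blank.
bool parseBatchOptions(int argc, const char* const argv[],
                       BatchOptions& options)
{
    if (argc != 4)
        return false;

    options.input = argv[1];
    options.output = argv[2];
    options.variables = argv[3];

    return !options.input.empty()
        && !options.output.empty()
        && !options.variables.empty();
}
```

When parseBatchOptions() returns false, the program could fall back to showing the form as it does now, so the interactive and batch modes share one executable.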

If you'd like to see how my company has used Replace.exe, check out the Corporate Art Gallery at

www.cosolutions.com

The entire gallery is periodically generated at random from a larger group of image files. Visitors to our gallery get to see different pictures, and we don't have to edit a single line of HTML to change them.
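
The article doesn't show how the gallery's variables file is produced, so the following is only a guess at the general approach, written in standard C++ (C++11 for the random-number facilities). The function makeGalleryVariables(), the <!--VAR:ImageN--> names, and the img markup are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Pick up to count distinct images at random and return the text
// of a variables file, one name=value line per picked image.
// The variable names and the <img> markup are assumptions.
std::string makeGalleryVariables(std::vector<std::string> images,
                                 std::size_t count, unsigned seed)
{
    // Shuffle a copy of the list, then keep the first few entries.
    std::mt19937 generator(seed);
    std::shuffle(images.begin(), images.end(), generator);
    if (count > images.size())
        count = images.size();

    std::ostringstream out;
    for (std::size_t i = 0; i < count; ++i)
        out << "<!--VAR:Image" << (i + 1) << "-->=<img src=\""
            << images[i] << "\">\n";
    return out.str();
}
```

Saving the returned text as the variables file and running Replace.exe against a gallery page that contains the matching variable names would then produce a fresh page with no hand editing.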