Validating URLs

by Mark G. Wiseman

Writing programs that use the Internet has become very popular, and numerous freeware, shareware and commercial libraries and components can make Internet programming easier. But did you know you may already have a good library of Internet functions?

The Windows Internet API

If you have Microsoft Internet Explorer on your computer, you have the Windows Internet API. This API is housed in the WININET.DLL file and is declared in the WININET.H header file. These are the same Internet access functions that Internet Explorer uses.

As far as I could tell, there is no documentation on the API included with C++Builder. However, you can find everything you need to know on Microsoft’s MSDN web site. Here is a good link to start with: msdn.microsoft.com/workshop/networking/wininet/reference/functions/general.asp.

You can use the Windows Internet API in your code by including the WININET.H file and linking in the WININET.LIB file. Both of these files are included with C++Builder. The WININET.DLL file itself, as mentioned above, comes with Internet Explorer 4.0 or later.

Using the API

Let’s try out the API by writing a simple function to validate a URL. The ValidURL() function takes an AnsiString argument containing the URL to be tested and returns true if the URL is valid or false if it is not.

The code for the ValidURL() function can be found in Listings A and B. As you can see, it is really very simple.

Listing A: ValidURL.h.

#ifndef ValidURLH
#define ValidURLH


bool ValidURL(String url);


#endif   // ValidURLH

Listing B: ValidURL.cpp.

#include <vcl.h>
#pragma hdrstop

#include "ValidURL.h"

#include <wininet.h>


bool ValidURL(String url)
   {
   bool result = false;

   HINTERNET hSession = InternetOpen("ValidURL", INTERNET_OPEN_TYPE_PRECONFIG, 0, 0, 0);
   if (hSession != 0)
      {
      HINTERNET hFile = InternetOpenUrl(hSession, url.c_str(), 0, 0, INTERNET_FLAG_RELOAD, 0);
      if (hFile != 0)
         {
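         // Ask for the HTTP status code as a number rather than as a string.
         // If the query fails, code stays 0 and the URL is reported as invalid.
         int code = 0;
         DWORD codeLen = sizeof(code);
         HttpQueryInfo(hFile, HTTP_QUERY_STATUS_CODE | HTTP_QUERY_FLAG_NUMBER, &code, &codeLen, 0);

         result = code == HTTP_STATUS_OK || code == HTTP_STATUS_REDIRECT;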

         InternetCloseHandle(hFile);
         }

      InternetCloseHandle(hSession);
      }

   return(result);
   }

#pragma package(smart_init)

The ValidURL() function calls four functions in the Windows Internet API: InternetOpen(), InternetOpenUrl(), HttpQueryInfo() and InternetCloseHandle(). Be sure to look at the documentation for these four functions on the MSDN web site mentioned earlier.

First, ValidURL() calls the InternetOpen() function. This function initializes the Internet API for use by your application. It takes several arguments and returns an HINTERNET, a handle to an Internet session. If this return value is zero, InternetOpen() has failed for some reason. If a valid session handle is returned, you must remember to close it when you are finished with it. You do this by calling InternetCloseHandle().
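The rule that every valid handle must be closed could also be enforced automatically with a small wrapper class. The following is only a sketch: ScopedHandle is a hypothetical name that does not appear in WinINet or in the listings, and it relies on HINTERNET being a plain pointer type.

```cpp
// ScopedHandle is a hypothetical helper, not part of WinINet.
// Closer is any callable that accepts the handle, such as InternetCloseHandle.
template <typename Closer>
class ScopedHandle
{
public:
    ScopedHandle(void* handle, Closer closer)
        : handle_(handle), closer_(closer) {}

    // Close the handle on scope exit, but only if it was opened successfully.
    ~ScopedHandle()
    {
        if (handle_ != 0)
            closer_(handle_);
    }

    void* get() const { return handle_; }

private:
    // Copying is disabled so the handle is never closed twice.
    ScopedHandle(const ScopedHandle&);
    ScopedHandle& operator=(const ScopedHandle&);

    void* handle_;
    Closer closer_;
};
```

In ValidURL(), a declaration along the lines of ScopedHandle<BOOL (WINAPI*)(HINTERNET)> session(InternetOpen("ValidURL", INTERNET_OPEN_TYPE_PRECONFIG, 0, 0, 0), InternetCloseHandle); might then take the place of the explicit InternetCloseHandle(hSession) call, with session.get() supplying the handle.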

The ValidURL() function tests for a good session handle. If there was an error and the session handle is zero, ValidURL() returns false, reporting the URL as invalid. Strictly speaking, this is not correct, since the URL was never tested. To make the function more robust, you could throw an exception in ValidURL() instead of returning false.
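One way the exception idea might look is sketched here. RequireSession() is a hypothetical helper, and std::runtime_error is an assumption made to keep the example portable; a VCL application might prefer its own exception class.

```cpp
#include <stdexcept>

// Hypothetical helper, not part of WinINet: call it right after InternetOpen().
// HINTERNET is a plain pointer type, so it converts to const void* here.
inline void RequireSession(const void* hSession)
{
    // A zero handle means the session could not be opened at all,
    // which is different from the URL having been tested and rejected.
    if (hSession == 0)
        throw std::runtime_error("InternetOpen() failed; the URL was not tested");
}
```

Calling RequireSession(hSession) immediately after InternetOpen() would let a caller distinguish "could not test" (an exception) from "tested and invalid" (a false return value).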

If InternetOpen() returns a valid session handle, ValidURL() then calls InternetOpenUrl(). The InternetOpenUrl() function can open FTP, Gopher or HTTP URLs. Like InternetOpen(), the InternetOpenUrl() function returns a non-zero handle on success, and that handle must be closed with InternetCloseHandle() when it is no longer needed.

If the value returned by InternetOpenUrl() is zero, an error has occurred and ValidURL() will report an invalid URL. If the return value is a valid handle, ValidURL() then calls HttpQueryInfo().

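The HttpQueryInfo() function retrieves information about the server’s HTTP response; here it is asked for the status code as a number. This extra step is needed because a server that answers with an error, such as 404 Not Found, still yields a valid handle from InternetOpenUrl(). One of the arguments to HttpQueryInfo() is a pointer to an int variable named code. If the URL is valid, the value placed in this variable by HttpQueryInfo() should be either HTTP_STATUS_OK or HTTP_STATUS_REDIRECT.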
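The status test can be looked at in isolation. The sketch below uses local constants that mirror the values WININET.H gives HTTP_STATUS_OK (200) and HTTP_STATUS_REDIRECT (302); the names kStatusOk, kStatusRedirect and IsAcceptedStatus() are made up for this illustration.

```cpp
// Local stand-ins for HTTP_STATUS_OK and HTTP_STATUS_REDIRECT from WININET.H.
const unsigned long kStatusOk = 200;
const unsigned long kStatusRedirect = 302;

// Hypothetical helper: true only for the two status codes ValidURL() accepts.
inline bool IsAcceptedStatus(unsigned long code)
{
    return code == kStatusOk || code == kStatusRedirect;
}
```

With this test, a 404 response is rejected, and so is the initial value 0 that code keeps when HttpQueryInfo() fails.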

Conclusion

The Bridges Publishing web site has a small application that uses the ValidURL() function. Admittedly, the ValidURL() function in this article is probably the simplest Internet program you can write. However, it is a useful function and a great way to get your feet wet in Internet programming.