Gregory Deatz shares his expert knowledge of writing User-defined Functions for InterBase
(Paper delivered at the Borland Developers' Conference, San Diego, July 2000)

Introduction

What is a UDF?

A user defined function (UDF) in InterBase is merely a function written in any programming language that is compiled into a shared library. Under Windows platforms, shared libraries are commonly referred to as dynamic link libraries (DLL's).

Why write them?

After all, stored procedures can accomplish quite a bit on their own.

The truth of the matter is, InterBase does not come with a very rich set of built-in functions. Some common functions that are missing are modulo arithmetic, floating point formatting routines, date manipulation routines and string manipulation routines.

It just so happens that programming languages like Delphi and C can produce amazingly fast code to do modulo arithmetic, and other various date processing, floating point formatting and string manipulation routines.

It's also a well known fact that writing UDFs is an insanely easy task; however, the inexperienced DLL/shared library writer might be uneasy and uncomfortable with some of the requirements...

Some do's, some don't's

Before we start going through some examples of writing UDFs, let's talk about what you should be doing, and what you should not be doing.

Once you get the hang of writing UDFs, you will probably think that a whole world of InterBase extensibility has opened up to you through UDFs.

On the one hand, it has... The mechanisms for invoking UDFs are quite simple, and since a UDF is simply a routine written in your favorite programming language, you can do virtually anything, right?

Well, yes and no... One thing you can't do with UDFs: You can't pass NULLs to them. Likewise, a UDF cannot return a NULL value. Also, a UDF does not work within the context of a transaction. That is, transaction level information cannot be passed to a UDF, and therefore, a UDF isn't able to "dig back" to the database.

Sort of. A UDF can establish a new connection to the database and start another transaction if it so desires, but this is where we come to the "do's and don'ts", not to the "can'ts".

When you write UDFs, you should follow these two simple rules:
  • A UDF should be a simple, quick function.
  • A UDF should not attempt to directly affect the state of the database.
What does this mean?

Well, a function that trims a string, performs modulo arithmetic, performs fancy date arithmetic or evaluates aspects of dates are all nice, simple, quick functions. They are good examples of candidate UDFs.

Now, a function that attaches to a database, and inserts, deletes or updates data is probably a bad idea. A function that launches a program that performs a series of complex tasks is probably a bad idea. Why? Quite simply because these types of functions might stop a database from (a) doing transactional stuff or (b) even worse, they could significantly damage the performance of your server: As soon as a UDF is called, the thread that called that UDF blocks until the UDF returns.

Remember, of course, that these are general guidelines. Your particular business case might dictate a need to do something that is generally bad because in your case it is specifically good, kind of like a glaucoma patient smoking... O yeah, stay on topic now.

Let's get to the heart of the discussion, now!

return to top

Writing UDFs in Delphi for Windows platforms

Start a Delphi project

Start a Delphi DLL project (a special type of project, when you click "F"ile "N"ew).

Create a new unit for your functions
  • Now, do "F"ile, "N"ew... UNIT.
  • It might be wise to do a Save All at this point... put your project where you feel is a good spot...
Create a modulo routine

In the newly created unit:
  • Declare the routine in the interface section:
function Modulo(var i, j: Integer): Integer; cdecl; export;
  • Implement the routine in the implementation section
function Modulo(var i, j: Integer): Integer;
begin
if (j = 0) then
result := -1 // just check the boundary condition, and
// return a reasonably uninteresting answer.
else
result := i mod j;
end;

 
In the newly created project source:
  • Type the following immediately preceding the "begin end.":
exports
Modulo;

Build it, use it

So, build the project, and you now have a working DLL.

Now, I'm only going to mention this once: This simplest way to get InterBase to appropriately find the DLL is to copy it to the UDF directory under the InterBase Installation, which may be:
c:\Program Files\Borland\IntrBase\UDF.

To use the UDF.... do the following:
  • Connect to a new or existing database using ISQL.
  • Type the following:
declare external function f_Modulo
integer, integer
returns
integer by value
entry_point 'Modulo' module_name 'dll name minus ".dll"';
  • Commit your changes.
  • Now test it...
select f_Modulo(3, 2) from rdb$database

Whew! That was really easy, wasn't it?

But what about strings and dates?

What about 'em anyway? A string and a date are not considered "scalar" values in InterBase-ese, so special care must be taken when returning their values to InterBase.

Let's build a "Left" routine

Memory allocation issues
  • If you have IB 5.1 or lower, then type the following declaration in the interface section of your unit:
function malloc(Bytes: Integer): Pointer; cdecl; external 'msvcrt.dll';
  • If you have InterBase 5.5 or better, then you don't really need this bit of coding magic. Instead, make sure that the ib_util.pas file is in your compiler search path, and that ib_util.dll is in your real search path.
The simplest way to do this is to put
 
c:\Program Files\InterBase Corp\InterBase\include
 
in Delphi's library search path. Then, copy
 
c:\Program Files\InterBase Corp\InterBase\lib\ib_util.dll

to your windows system directory (typically c:\Windows\System).
  • Then, put ib_util.pas in your uses clause of your interface section.
uses
...,
ib_util;

What are all these strange memory allocation details? Why can't I just allocate memory with AllocMem, or whatever? The simple answer is: You can't, so don't ask! The more complicated answer is that every compiler uses it's own favorite algorithms for managing memory that has been given to it by the OS. As an example, MSVC manages memory differently from Delphi. Guess what? IB is compiled with MSVC. In pre-5.5 version, you have to link directly to the MS VC Run-time DLL, and in post-5.5 days, InterBase gives you an IB call to make this possible. So, let's get on with building a string-ish function!

Building the function
  • In the interface section of the newly created unit, type the following declaration:
function Left(sz: PChar; Cnt: Integer): PChar; cdecl; export;
  • In the implementation section of the unit, type:
(* Return the Cnt leftmost characters of sz *)
function Left(sz: PChar; var Cnt: Integer): PChar;
var
i: Integer;
begin
if (sz = nil) then
result := nil
else begin
i := 0;
while ((sz[i] <> #0) and (i < cnt)) do Inc(i);
result := ib_util_malloc(i+1);
Move(sz[0], result[0], i);
result[i] := #0;
end;
end;
  • In your project source, in the "exports" section, type:
exports
Modulo,
Left;
  • Now, build the project again...
  • Now, to use the routine, go to ISQL, and reconnect to the database you used above, and type:
declare external function f_Left
cstring(64), integer
returns cstring(64) free_it
entry_point 'Left' module_name 'dll name minus ".dll"';
  • And test it...
select f_Left('Hello', 3) from rdb$database

Still pretty simple, huh?

Let's build some date routines
  • In InterBase 6, three different "date" types are supported, DATE, TIME, and TIMESTAMP. For those familiar with InterBase 5.5 and older, the TIMESTAMP data type is exactly the equivalent of the 5.5 DATE type.
  • In order to "decode" and "encode" these types into something meaningful for your Delphi program, you need a "little bit" of information about the InterBase API. In your unit, put the following code:
interface
...
type
TM = record
tm_sec : integer; // Seconds
tm_min : integer; // Minutes
tm_hour : integer; // Hour (0--23)
tm_mday : integer; // Day of month (1--31)
tm_mon : integer; // Month (0--11)
tm_year : integer; // Year (calendar year minus 1900)
tm_wday : integer; // Weekday (0--6) Sunday = 0)
tm_yday : integer; // Day of year (0--365)
tm_isdst : integer; // 0 if daylight savings time is not in effect)
end;
PTM = ^TM;

ISC_TIMESTAMP = record
timestamp_date : Long;
timestamp_time : ULong;
end;
PISC_TIMESTAMP = ^ISC_TIMESTAMP;

implementation
...
procedure isc_encode_timestamp (tm_date : PTM;
ib_date : PISC_TIMESTAMP);
stdcall; external IBASE_DLL;
procedure isc_decode_timestamp (ib_date: PISC_TIMESTAMP;
tm_date: PTM);
stdcall; external IBASE_DLL;
procedure isc_decode_sql_date (var ib_date: Long;
tm_date: PTM);
stdcall; external IBASE_DLL;
procedure isc_encode_sql_date (tm_date: PTM;
var ib_date: Long);
stdcall; external IBASE_DLL;
procedure isc_decode_sql_time (var ib_date: ULong;
tm_date: PTM);
stdcall; external IBASE_DLL;
procedure isc_encode_sql_time (tm_date: PTM;
var ib_date: ULong);
stdcall; external IBASE_DLL;
  • Now, let's write some date UDFs!
  • In the interface section of the newly created unit, type the following declaration:
function Year(var ib_date: Long): Integer; cdecl; export;
function Hour(var ib_time: ULong): Integer; cdecl; export;
  • In the implementation section of the unit, type:
function Year(var ib_date: Long): Integer;
var
tm_date: TM;
begin
isc_decode_sql_date(@ib_date, @tm_date);
result := tm_date.tm_year + 1900;
end;

function Hour(var ib_time: ULong): Integer;
var
tm_date: TM;
begin
isc_decode_sql_time(@ib_time, @tm_date);
result := tm_date.tm_hour;
end;
  • In your project source, in the "exports" section, type:
exports
Modulo,
Left,
Year,
Hour;
  • Now, build the project again...
  • Now, to use the routine, go to ISQL, and reconnect to the database you used above, and type:
declare external function f_Year
date
returns integer by value
entry_point 'Year' module_name 'dll name minus ".dll"';

declare external function f_Hour
time
returns integer by value
entry_point 'Hour' module_name 'dll name minus ".dll"';
  • And test it...
select f_Year(cast('7/11/00' as date)) from rdb$database

Not quite as easy as string or number manipulations, but still pretty simple, huh?

return to top

Writing UDFs for Linux/Unix platforms

Most Linux novices that have been initiated into programming in the Windoze world will probably feel a bit intimidated by the notion of "writing a UDF for Linux/Unix platforms".

The process couldn't be simpler, and, in some respects, I find it easier and "more intuitive" than doing it under windows platforms.

Whenever you compile a C-File, it creates an "object" file, which is something that will be statically linked to some other code during a "linking" phase, which generally produces an executable file of some form.

It's during this "linking" phase that we are going to tell the c-compiler to create a "shared library", which is essentially a "shared object" file that can be dynamically linked to a program at run-time, not at compile-time.

In Windoze-ese, we call these "things" Dynamically Linked Libraries, because the library of functions is "linked" to the executable dynamically, at run-time. Thus we arrive at the "DLL" extension for these files.

In Unix/Linux-ese, we call these "things" shared libraries, which are essentially "shared object" files--libraries of code that can be dynamically linked at run-time. Thus we arrive at the conventional "so" extension for these files.

You need to remember that there is nothing inherently different between a Linux "Shared Library" and a Windows "DLL". They are the same thing, at least in concept.

So.... how do we create a shared library under Linux?

Create a C-file

This much is easy, right? Just open a text file with a .c extension.

Create the modulo routine

int modulo(int *, int *);

int modulo(a, b)
int *a;
int *b;
{
if (*b == 0)
return -1; // return something suitably stupid.
else
return *a % *b;
}

Build it, use it

At the command-line
gcc -c -O -fpic -fwritable-strings <your udf>.c
ld -G <your udf>.o -lm -lc -o <your udflib>.so
cp <your udflib>.so /usr/interbase/udf

In ISQL
declare external function f_Modulo
integer, integer
returns
integer by value
entry_point 'modulo' module_name 'name of shared library';

commit;

select f_Modulo(3, 2) from rdb$database;

Holy guacamole, Batman! That was really easy.

Now, instead of going through the motions of writing the other routines, I'll leave it as an exercise for the reader. HOWEVER, if there proves to be time in the lecture, I will go through more examples.

return to top

Conclusions

Wow! Writing UDFs is really easy--there isn't much to it--and look! Linux development ain't so difficult after all.

And, of course, we can make the following brain-dead conclusions about developing UDFs for InterBase:

They are easy. There is no excuse for not building them if you need them.

Don't get carried away. As powerful as UDFs can be, don't get carried away. Be very objective when deciding where you should place little tidbits of functionality: Am I better served by a UDF or a stored procedure?

And that, my friends, is UDF development.

For even more examples and happy-fun programming, download FreeUDFLib, a Delphi UDF library for InterBase and FreeUDFLibC a C based UDF library for InterBase that runs on Solaris, Linux, Windows, etc... at InterBase.

return to top