Washington Area Informix User Group Newsletter


Volume 4, No. 3 - July 1994


Highlights of This Issue

User Group Activities and Events

Second Annual Special Issue for New Members

This is the second year we have produced a special edition of our newsletter for our new members and for the Informix Worldwide User Conference in Tampa, Florida. In the last year we have grown from 200 to over 500 Informix users on our mailing lists. We held a one-day Forum last November with 220 attendees, 16 speakers and 10 exhibitors. We had technical presentations on Informix 4GL programming tips, networking, and OnLine DSA 6.0. Our quarterly meetings have included presentations on support services through Informix Link and Informix on Windows NT. This has been a very exciting year for our user group, and we look forward to another challenging year. Our objective is to provide technical tips, new product information and the opportunity to network with other Informix users.

The Informix User Group in the Washington area was started in May 1991. Nineteen people from a variety of companies and federal agencies met to hear about release 5.0 and make plans for a user group. At that meeting we identified the goals of the user group: to share information, learn about new products, provide technical input, discuss problems, share solutions, and network with other Informix users. Our first newsletter was sent out in June 1991. This issue is also being mailed to many people in Maryland, Virginia and the District of Columbia who are not on our membership lists. If you would like to receive regular mailings from our user group, please send in the membership form on the last page.

Next Meeting

Our next meeting is Wednesday, August 24, 1994 from 5:00 to 9:00 pm. The meeting will be held at the Informix Office, 2110 Wilson Blvd., 9th Floor, Arlington, VA. Please note the new address and time for the meeting.

Agenda:

Newsletter Sponsorship

The user group has been supported by many companies over the years. If your company would like to sponsor a mailing of the newsletter please call for more information. We would like to thank the following companies for sponsoring this newsletter:

Last Meeting - Informix Database on Windows NT

Our May meeting was a joint meeting with the Advanced Systems User Group (Windows NT User Group). Pat Higbie, President of DataFocus, Incorporated, demonstrated Informix SE on Windows NT. DataFocus helped develop the POSIX subsystem in NT and ported Informix SE to Windows NT for Informix. They are experts in the area of UNIX porting and compatibility and offer a product for porting applications from Unix to NT. This was a rare behind-the-scenes look at the Informix database and Windows NT. Microsoft Corporation hosted the meeting.

Elections for the Board of Directors

At the August meeting we will be holding elections to the WAIUG Board of Directors. The Board is composed of volunteers who plan our activities and work hard at putting them together. If you are interested in volunteering to be on the Board, or would like to nominate a member, please contact the Secretary of the WAIUG, Rick Montgomery, at 703-756-7273.

User Group Membership

Our user group has grown to over 500 names on our mailing list this year. We are incorporated as a non-profit organization. Our objective is to share information, learn about new products, provide technical input, discuss problems, share solutions, and network with other Informix users. In order to cover our expenses and support continued growth, the board of directors decided to implement membership dues in 1994. The membership dues will be $20.00. Everyone who registered and paid for the User Group Forum in 1993 will automatically be covered as a member for one year. Membership will allow you to receive all our newsletters and a discount on our forum next year. For more information about membership, please call our Membership Director, John Petruzzi, at 703-490-4598.


SQL Optimization - the Set Explain Command

by Kevin Fennimore

When performing queries on a database, the engine must determine how to access the tables. This is the job of the optimizer. The optimizer determines the appropriate order in which to read from tables when performing joins, and how to access the data, whether via an index, sequentially or some other method. It analyzes various information about the table or tables involved in the query, such as the size of each table, the indexes on it and the selectivity of the filters in the query (for a discussion of selectivity see the Guide to SQL Tutorial manual).

The order that the optimizer will use to read from the tables involved in a join is called the query path. When three tables are joined the optimizer may decide to read from table 2, then table 3 and finally table 1. This table2-table3-table1 ordering is the query path. The right query path can prevent the engine from doing extra work by reducing the amount of data that is read to execute a join. This might be accomplished by reading from a smaller table first and then joining to a larger table. Consider the following query:

		SELECT * FROM table1, table2
		WHERE table1.x = table2.y
Table1 has 100 records and table2 has 50,000 records. For every record we read from table1 (100 reads) we would read 50,000 records from table2 to satisfy the join (assuming no indexes on table2). This would be a total of 5,000,100 reads (100 table1 reads + 100 * 50,000 table2 reads). If, however, we read from table2 first (50,000 reads) and then from table1 (100 reads for each table2 record), we would perform 5,050,000 reads (50,000 table2 reads + 50,000 * 100 table1 reads). This order requires an additional 49,900 reads.

Now consider what would happen if we added indexes to the tables. Let's assume that an indexed read into table1 requires 2 index reads and 1 data read, for a total of 3 reads to get a particular value, and that table2 requires 4 reads (3 index reads and 1 data read). When we read from table2 first, we'd read each record from table2 and do an indexed read into table1. This would require 200,000 reads (50,000 reads from table2 + 50,000 * 3 reads from table1). But when we read from table1 first and then do an indexed read into table2, we only need 500 reads (100 reads from table1 + 100 * 4 reads from table2).

As you can see, the number of reads required can be reduced just by reading the tables in the right order, and reduced even further by adding indexes to the join columns. In addition to reading fewer records, other savings can be realized. Reading rows from a table via an index returns them sorted on the index key, which eliminates the need for additional sorting.

In order to determine what the optimizer is doing and how data is being accessed, we can use the SET EXPLAIN ON command. This command is available in SQL, 4GL and any of the embedded-SQL languages. For example:

				SET EXPLAIN ON;
				SELECT 1;
				SELECT 2;
				SET EXPLAIN OFF;
When this set of SQL commands is run, a query plan is output to the file sqexplain.out in the current directory. The file is created if it does not already exist and is appended to if it does exist. Following is an example of some queries run with SET EXPLAIN turned on and an explanation of the output. Explanations are in italics.

All queries begin with the following line, followed by the query statement:

QUERY:
------
select * from stock order by description
The next two lines are common to all queries.

Estimated Cost: 6
The estimated cost is a number assigned to the query and is primarily used as a comparison to other query paths. When this query is run over and over again this number may vary slightly. This number is most useful when a change is made to the query (e.g. adding an index) and it drops significantly.

Estimated # of Rows Returned: 15
The estimated number of rows returned is only an estimate. I have seen cases where a query returned thousands of rows and this number was only 2 or 3.

Temporary Files Required For: Order By  
This line reports whether the optimizer has determined that a temporary file is required. In this example a temporary file is required for the order by. As you may recall from the "Indexing Strategies" article, creating an index on the description field would eliminate the need for this temporary file, since the optimizer could use the index to order the rows, as in the next example. However, that may not be desirable since description is a character field.

The rest of the query plan lists the table access strategy. This is the order in which the tables will be read and how they will be accessed.
1) informix.stock: SEQUENTIAL SCAN
In the example above only one table is involved, so it is read first, and it is read via a SEQUENTIAL SCAN. In this case we are selecting all of the rows with no joins, so other than adding an index for the order by there isn't much else we could improve.

QUERY:
------
select * from stock where unit_price>20 order by stock_num

Estimated Cost: 3
Estimated # of Rows Returned: 5

1) informix.stock: INDEX PATH
    Filters: informix.stock.unit_price > 20 
    (1) Index Keys: stock_num manu_code

This example (above) shows the same query except that it is ordered on stock_num and has a filter on unit_price. Notice that no temporary files are required and that the access method this time is INDEX PATH rather than SEQUENTIAL SCAN. We can also see, on the Index Keys line, the columns that make up the index used (stock_num, manu_code). This is a good example of the engine using the leading portion (stock_num) of a composite index (stock_num, manu_code). The Filters are applied to each row that is read from the database to determine if it meets the query criteria.

QUERY:
------
select  manu_code from stock

Estimated Cost: 2
Estimated # of Rows Returned: 15

1) informix.stock: INDEX PATH
    (1) Index Keys: stock_num manu_code   (Key-Only)

For this query the optimizer uses an index path to read the information from the table. It uses the index because it can take advantage of the Key-Only read capability available with OnLine (this type of access is not available with the Standard Engine). A Key-Only read is when the engine reads all of the selected data from the key of an index. This can greatly reduce the number of reads required to satisfy the query: since many keys can fit into one index node, reading one node can yield many rows of data. Also note here that the engine is able to perform Key-Only reads on a column (manu_code) that is not the leading portion of a composite index, unlike joins and ordering, which can use only the leading portion of a composite index.

QUERY:
------
select * from stock,items
where stock.stock_num=items.stock_num
and items.quantity>1

Estimated Cost: 9
Estimated # of Rows Returned: 22

When two or more tables are used in a query the tables are listed in the order in which they will be accessed. In this example we read from the stock table sequentially and then access the items table via the index on stock_num, manu_code. The filter is also applied on the quantity column of the items table.

1) informix.stock: SEQUENTIAL SCAN
2) informix.items: INDEX PATH
    Filters: informix.items.quantity > 1 
    (1) Index Keys: stock_num manu_code 
        Lower Index Filter: informix.items.stock_num = informix.stock.stock_num 

When an indexed read is done it is necessary to find the first key value where the indexed read will begin. This starting position is defined by the Lower Index Filter. As we will see in the next example, there is also an Upper Index Filter, which defines the key value at which to stop reading the index.

QUERY:
------
select * from stock where stock_num>10
and stock_num<14

Estimated Cost: 1
Estimated # of Rows Returned: 1

1) informix.stock: INDEX PATH
    (1) Index Keys: stock_num manu_code 
        Lower Index Filter: informix.stock.stock_num > 10 
        Upper Index Filter: informix.stock.stock_num < 14 

Here we are doing an indexed read of the table and using the Lower and Upper Index Filters to satisfy the filter condition on stock_num.

QUERY:
------
select * from items,stock 
where items.total_price=stock.unit_price

Estimated Cost: 19
Estimated # of Rows Returned: 59

1) informix.items: SEQUENTIAL SCAN
SORT SCAN: informix.items.total_price 

2) informix.stock: SEQUENTIAL SCAN
SORT SCAN: informix.stock.unit_price 

MERGE JOIN
    Merge Filters: informix.stock.unit_price = informix.items.total_price 

This example uses the concept of the Sort Merge Join to join the two tables together when no suitable index can be found. The SORT SCAN indicates that a sort will be done on the specified columns in preparation for a Merge Join. Then the MERGE JOIN indicates that the sorted columns are to be merged. This type of join is not very efficient, since additional overhead is incurred during the sort and merge processes. This query is a prime candidate for adding indexes to the join columns to improve performance.

Note: The Sort Merge Join is only available in versions 5.0 and beyond. Prior to version 5.0 an AUTOINDEX PATH would have been used. This is even less efficient than the Sort Merge Join since it actually builds a temporary index on the columns being joined. Both of these access methods should be avoided whenever possible.
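For illustration, indexes on the two join columns could be created as follows (the index names are invented for this example):

	CREATE INDEX ix_items_total_price ON items( total_price );
	CREATE INDEX ix_stock_unit_price ON stock( unit_price );

With an index available on either join column, the optimizer can use an INDEX PATH for the inner table of the join instead of sorting and merging both tables.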

QUERY:
------
select * from qps@ctl0104:tord_stat

Estimated Cost: 2
Estimated # of Rows Returned: 7

1) qps@ctl0104:informix.tord_stat: REMOTE PATH
    Remote SQL Request:
    select x0.ord_stat_ind ,x0.ord_stat_desc ,x0.user_stamp ,x0.dt_timestamp ,x0.trans_dt_timestamp from qps:"informix".tord_stat x0

This query is accessing a table on a remote database server. This is the reason for the REMOTE PATH. The request that is actually sent to the remote server is also output. There is no way to access the query plan for the request on the remote server.

When using the SET EXPLAIN command you should keep in mind the effect of the UPDATE STATISTICS command on the optimizer. As the database changes, the UPDATE STATISTICS command should be run occasionally to assure that the statistics used by the optimizer are up to date. Running UPDATE STATISTICS on a table can completely change the query plan in the SET EXPLAIN output.

The SET EXPLAIN command is the best tool available for analyzing slow queries, but it can also be useful during development. It can be used to double-check that a query is using the indexes you think it should be using. You can also use the SET EXPLAIN command in a program to analyze all of the queries that are performed during the program's execution. This output can be reviewed in a code walk-through or used to follow the logic of unfamiliar code. Development applications could include a call to a function that checks an environment variable and, based on its value, turns SET EXPLAIN on. This way no recompilation is necessary to turn the explain feature on; simply change the environment variable. It is a good idea not to have SET EXPLAIN turned on at all times, since it appends to the sqexplain.out file in the current directory and can produce some rather large files.
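As a sketch, such a function might look like the following; the environment variable name DEBUG_EXPLAIN is an assumption for this example, and its value is read with the 4GL built-in fgl_getenv():

	# A sketch: turn SET EXPLAIN on only when the (hypothetical)
	# DEBUG_EXPLAIN environment variable is set to "Y"
	FUNCTION check_explain()
		DEFINE flag CHAR(1)
		LET flag = fgl_getenv( "DEBUG_EXPLAIN" )
		IF( flag = "Y" )
		THEN
			SET EXPLAIN ON
		END IF
	END FUNCTION

Calling check_explain() once at program startup turns the trace on for the entire run when the variable is set, and leaves it off otherwise.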

Kevin Fennimore, UCI Consulting, 2413 Arbor Lane, Hillsborough, NC 27278, Phone: 919-644-1073


Informix 4GL Programming Tips

By Lester Knutsen

This article is the result of a presentation I did for the Washington Area Informix Users Group last March on 10 Informix 4GL programming tips. Since then, I have expanded on the ideas with additional input from a second presentation for the Atlanta user group, training seminars I have conducted, and a discussion in comp.databases.informix on the fastest way to perform an update. It has grown to 12 4GL tips, most of which also apply to SQL. Most of these tips are basic; the article is targeted at the new Informix 4GL programmer.

1. Only select columns that you really need

When you select data it goes from disk to memory. Ten unneeded bytes of data from 10,000 rows translate into reading 100,000 bytes unnecessarily. Avoid "SELECT *" statements and only select the fields you will be using in your program. Some of the biggest performance gains I have seen have come from eliminating unused fields from select statements. In a network environment this will also reduce network traffic.
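As a minimal illustration, using the customer table from the stores demonstration database:

	-- reads every column of every row
	SELECT * FROM customer
	-- reads only the two columns the program actually uses
	SELECT customer_num, lname FROM customer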

2. Only select rows when they are really needed

Several programs I have reviewed for performance started out by selecting all the rows in a table, making the user wait while this select ran. Even if the user only wants to locate a few records, they have to wait until all records are located. Avoid select statements without a where clause, and statements that select rows a user may not need.

3. Use the dual cursor model for selecting and updating data

One approach is to use a dual cursor model. The first cursor selects all the rowids a user may need; as the user picks the next row, a second cursor selects the columns of data from that row. The example program at the end of this article uses the dual cursor model.

4. Only update the columns that need to change

We want to avoid transferring unnecessary bytes between memory and disk, so only update the columns that need to change. Avoid updating all columns unnecessarily, for example with "set tab.* = rec.*". Also, avoid updating indexed columns when you can: the data has to be written to disk and the index updated, requiring two or more writes to disk.

5. Use "UPDATE WHERE CURRENT OF" syntax

When you are updating a row, you most likely have already selected the row and loaded it into memory. The statement "update table set X=X where rowid = rec.rowid" re-selects the data from disk before updating it. Use the statement "update where current of" to update a row you have already selected. This will speed up your updates. See the example program at the end of this article.

6. Use of direct updates vs. selecting data for update

There was a very interesting discussion on the Internet in comp.databases.informix on the fastest way to update 2 million rows. The general consensus was to turn off logging and let the database engine do the update for the fastest results. I did some tests, and this was 5 to 6 times faster than selecting the data and then updating it using the dual cursor method.
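As a sketch, the direct form of the price update used in Figure 1 below is a single statement; turning logging off beforehand is an administrative step performed by the DBA:

	UPDATE items SET total_price = total_price * 1.1

The engine scans and updates the rows itself, with no data traveling between the engine and the application program.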

7. Using LET vs INITIALIZE

The INITIALIZE statement performs a subroutine call, which is slower than the LET statement, which performs a byte copy. At the beginning of a program, I like to initialize a null record (null_rec.*). Then in the program, use "LET p_rec.* = null_rec.*". This is faster than calling INITIALIZE repeatedly.
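A minimal sketch of this technique; the record names p_rec and null_rec and the customer table are illustrative:

	DEFINE p_rec, null_rec RECORD LIKE customer.*

	# once, at program startup
	INITIALIZE null_rec.* TO NULL

	# each time a clean record is needed - a fast byte copy
	LET p_rec.* = null_rec.*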

8. Using GLOBAL variables vs. passing parameters

Passing variables to functions requires pushing and popping values on the 4GL stack. Look at the C code generated by 4GL and examine what happens when you pass a variable to a function; there is some extra overhead in this. I recommend carefully selecting variables that are passed frequently to functions and making them GLOBAL variables. GLOBAL variables are accessed directly in memory.

9. Program initialization functions

Avoid repeatedly opening and closing the same forms, preparing static cursors, and initializing NULL variables throughout a program. At the beginning of a program, have a function that performs all initialization: initializing null variables, opening forms and preparing cursors.

10. Update statistics

The database optimizer needs accurate information about what is in your database to perform well. After adding or deleting many rows in a 4GL program, use the statement "UPDATE STATISTICS FOR TABLE table_name" to refresh the information the optimizer uses.

11. Plan your locking method

Row locking allows many people to update data at the same time. However, it can consume system resources, and an update may fail if your system does not have enough locks available. If users are updating single rows, row-level locking works best. For batch updates, however, consider locking the entire table. One rule of thumb I have used: if the update will involve more than 200 rows or 25% of the table's rows, lock the table.
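For a batch update, the table can be locked for the duration of the transaction, as in this sketch (using the items table from Figure 1 below):

	BEGIN WORK
	LOCK TABLE items IN EXCLUSIVE MODE
	UPDATE items SET total_price = total_price * 1.1
	COMMIT WORK

The commit releases the table lock. One exclusive table lock is held instead of one lock per row, so the update cannot fail by running out of locks.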

12. Plan transaction length

Avoid long processes that may fill up your logs and lock the database engine. This is one advantage of the dual cursor model vs. the direct update model. The direct update model is one long transaction and may fill up your logs. Using a cursor and committing after each row is updated will ensure that your logs do not fill up.

Figure 1. Example Program Using the Dual Cursor Model
######################################################################
#           Copyright 1994 Advanced DataTools Corporation
# Function:    	updcursor.4gl
# Author:      	Lester B. Knutsen
# Description: 	A program to demo the dual cursor approach
# Date    	Name 		Comments
#-----    	-----     	----------------
#020194 	LK     		Created
######################################################################

database stores
globals
     define    	price          	like items.total_price,
          	new_price      	like items.total_price,
          	rowno          	integer
end globals

main

## Declare the big cursor, selecting only the rowid or a unique indexed
## field for all rows required.
prepare pc_select_all from " select rowid from items "
declare c_select_all cursor with hold for pc_select_all
## The with hold option will keep the cursor open after a commit work

## Declare the small update cursor to lock one row at a time - locking the
## smallest data possible.
prepare pc_lock from "select total_price from items where rowid = ? for update of total_price"
declare c_lock cursor for pc_lock 

## Prepare the update statement to update the current row in memory.
prepare pc_upd from "update items set ( total_price ) = ( ? ) where current of c_lock "

## Fetch the Big cursor.
foreach c_select_all into rowno
	 ## On every row - fetch the small cursor.  This will lock and update one row at a time
     	begin work     ## Used when database logging is turned on 
     	open c_lock using rowno
     	fetch c_lock into price
          	let new_price = price * 1.1
          	execute pc_upd using new_price
          	display "Update from ", price, " to ", new_price
     	commit work  ## Used when database logging is turned on 
end foreach

display "Update complete "
display ""
end main

Lester Knutsen, Advanced DataTools Corporation, 4510 Maxfield Drive, Annandale, VA 22003, Phone: 703-256-0267 or Email: lester@access.digex.net.


Avoiding Database Crashes

by Joe Lumbley

In an environment with multiple programmers who are familiar with Unix, your most common problem will occur when somebody kills an engine process. Informix does not take the untimely death of a sqlturbo engine process lightly. This often causes the database to abort, necessitating a restart and recovery that can take anywhere from minutes to hours.

In some instances, you will have a "rogue" sqlturbo process running that cannot seem to be killed. If a user kills this process at just the wrong time, the entire database can go down. More sophisticated Unix users will often resort to "kill -9" to try to stop their queries.

This problem can best be solved by enforcing a prohibition that NOBODY but the DBA can "kill -9" a sqlturbo process. It's safe for anyone to "kill -15" a process. The problem here is that if a Unix user gives a "kill -15" command, the process will not necessarily die immediately. If the sqlturbo process goes into a rollback status because it received the command in the middle of a transaction, the process will remain around until the rollback completes. An impatient user may try a few "kill -15"s and finally try for a sure kill with "kill -9". If the system is in rollback, OnLine will probably crash. If the process was holding any locks or if it was in a critical write, OnLine can crash. The Informix "tbmode -z" command works in the same way: the job doesn't immediately disappear.

Both "kill -15" and "tbmode -z" are safe to use any time, as long as the user is prepared to be patient. It's usually best to have the users check with the DBA if the jobs don't die within about 15 minutes.

Processes that take longer than fifteen minutes can pose a quandary even for the DBA. It's possible that the query was a disjoint query and the system is trying to sort a billion rows for you. You could be in a long transaction on the verge of either going into forced rollback or filling up all your logs, causing a crash. If the DBA can't identify exactly what's happening, it is usually best to let the process finish its job, unless this will cause other problems in the database, such as filling up the logs. If the runaway job begins to affect performance or response time, the DBA usually needs to find a way to kill it.

By the time the DBA sees the typical rogue sqlturbo process, the user has probably already tried to kill it with "kill -15" or "tbmode -z", and it's probably in rollback. Rollback can be identified by the flags "--R--X--" in the output of "tbstat -u". The Unix "ps" command will tell the DBA whether the sqlturbo process is getting any time. If so, the rollback is in progress. No matter what you do, the rollback has to complete. It will complete either with the system online or during the recovery stage of database startup. Online is usually best.

The worst case for the DBA is a process that was written in ESQL/C, or that for some other reason responds only to a "kill -9". Here, if you absolutely have to stop the process, a "kill -9" is your last resort. If your Unix has job control (do you have a "bg" command to put a job in the background?), read the next paragraph before doing a "kill -9". Otherwise, do it and cross your fingers.

Some Unix operating systems that support placing jobs in the foreground or background have another option to "kill" that could give you an out in the above situation. If you stop the process either with job control or with a "kill -STOP" command, you may see from the Unix "ps" command that the process eventually gets no time and is in a sleeping state. You can then use "tbstat -u" and check the flags for the rogue process. If the flags don't show an "X", indicating that the process is not in a critical state, you have a chance. If it is in a critical state, use your equivalent of a "kill -CONT" command to crank it up again. Keep stopping and starting it until it gets no time and doesn't show an "X". Note the username column of the "tbstat -u" output. There is also a column named "locks". If the process is holding no locks, you're probably safe in doing a "kill -9". If it doesn't work and you really needed to kill the process, you haven't lost anything; you would have had to do the "kill -9" anyway.

No matter what you do, no matter how careful you are, you will crash like this sometimes. If the consequences are bad enough, the users usually learn to make the DBA kill the tough jobs.

This is an excerpt from a new book on database administration by Joe Lumbley:

The Informix Database Administrator's Survival Guide. ISBN 0-13-124314-1, Informix Press.

Joe Lumbley, 819 N. Bishop Street, Dallas, TX 75208, Phone: 214-450-9896, Email: jlumbley@netcom.com


The PREPARE Statement

by Kevin Fennimore

As defined by the Informix Guide to SQL manual, the prepare statement is used to "parse, validate, and generate an execution plan for SQL statements whose structure is unknown when your program is compiled." The most common use of the prepare statement is implementing query-by-example screens, as in the following segment of code:

		CONSTRUCT BY NAME where_clause ON screen_rec.*
		LET stmt = "SELECT * FROM some_table WHERE ", where_clause
CLIPPED
		PREPARE p1 FROM stmt
		DECLARE c1 CURSOR FOR p1
		FOREACH c1 INTO some_rec.*
			. . .
		END FOREACH

However, there are other uses for the prepare statement, such as significantly increasing performance. To understand how the prepare statement impacts performance, let's take a closer look at what happens when a statement is prepared and when a statement is run in an application.

When a statement is prepared, the following actions are performed by the engine (i.e. the sqlexec or sqlturbo process):

	- the syntax of the statement is checked,
	- the user's permissions to perform the statement are checked, and
	- the statement is optimized and an execution plan is generated.

These actions require reads of the system catalogues (systables, syscolumns, sysindexes, etc.). The syntax checks require reading catalogues like systables and syscolumns to check for the existence of the specified table and column names. The permission checks require reading the sysusers, systabauth and syscolauth tables. The optimization requires reading catalogues such as sysindexes and systables. These are just some of the catalogues involved in preparing a statement; there are many others.

When a statement is run in any Informix application it must be prepared and then executed.  For example, the following statement:

	INSERT INTO some_table VALUES( p1, p2, p3 )

when run in 4GL or an ESQL program, actually translates to a prepare statement and an execute statement.  The above insert statement is the same as the following statements:

	PREPARE p1 FROM "INSERT INTO some_table VALUES( ?, ?, ? )"
	EXECUTE p1 USING p1, p2, p3

Let's consider an example of an insert statement inside a loop:

	LOOP 1000 TIMES
		INSERT INTO some_table VALUES( p1, p2, p3 )
	END LOOP

In this example the statement is prepared 1000 times and executed 1000 times. Recall from above that when a statement is prepared, it must be checked for syntax and permissions and then optimized, all of which requires the overhead of reading the system catalogues plus additional processing time. However, the statement only needs to be prepared once, which means that the above example incurs a large amount of overhead by preparing the statement an additional 999 times. A better way to perform this insert is as follows:

	PREPARE p1 FROM "INSERT INTO some_table VALUES( ?, ?, ? )"
	LOOP 1000 TIMES
		EXECUTE p1 USING p1, p2, p3
	END LOOP

This example prepares the statement once and then executes it 1000 times, removing the overhead of unnecessarily preparing the statement 999 extra times.

This concept can be carried over into other statements as well, such as select and update statements. Consider the following function which validates a state:

		FUNCTION valid_state( entered_state )
			DEFINE entered_state CHAR(2)
			DEFINE not_used CHAR(2)
			
			SELECT state INTO not_used FROM state_table
				WHERE state = entered_state
			IF( STATUS = NOTFOUND )
			THEN
				RETURN FALSE
			ELSE
				RETURN TRUE
			END IF
		END FUNCTION

If this function were called in an after field clause of an input statement, the user might notice a temporary pause in the movement of the cursor to the next field. Some of this could be the time spent searching the state table; more likely, however, the pause occurs while the program prepares the statement and then executes the select. A better way of writing this function would be:

		FUNCTION init_valid_state()
			DEFINE stmt CHAR(50)
			LET stmt="SELECT state FROM state_table WHERE state =
?"
			PREPARE p_valid_state FROM stmt
			DECLARE c_valid_state CURSOR FOR p_valid_state
		END FUNCTION
		
		FUNCTION valid_state( entered_state )
			DEFINE entered_state CHAR(2)
			DEFINE not_used CHAR(2)
			
			OPEN c_valid_state USING entered_state
			FETCH NEXT c_valid_state INTO not_used
			IF( STATUS = NOTFOUND )
			THEN
				RETURN FALSE
			ELSE
				RETURN TRUE
			END IF
		END FUNCTION

The function init_valid_state() would be called at the beginning of the program to prepare the select statement and declare a cursor for it. Then when the valid_state() function is called, it simply opens the cursor and fetches the value. This avoids the need to prepare the statement each time the function is called. There are drawbacks to preparing too many statements. One is that the select is prepared every time the program is run, even if the valid_state() function is never called. Also, if there are several statements being prepared at the beginning of the program, it will take longer to start up. There is also a limit to the number of statements that can be prepared at any given time; version 4.01 has a limit of 256 and earlier versions have a limit of 64. However, statements can be freed to release their resources and allow other statements to be prepared.
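For example, the prepared statement and cursor from the sketch above can be released with the FREE statement once they are no longer needed:

	FREE c_valid_state
	FREE p_valid_state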

The most common question about prepares is "How many times does a statement have to be run before it should be prepared?" There is no definitive answer. The thing to keep in mind is that any statement that is run will be prepared and then executed, so preparing a statement that is only executed once doesn't hurt performance. Preparing statements shows a greater performance improvement with the Standard Engine than with OnLine. This is because the Standard Engine does not have the buffering capabilities of OnLine. Typically, OnLine will have most of the system catalogs buffered in shared memory, so the preparation of a statement will not require as many physical disk reads. However, the preparation still requires processing time, which can be costly.

Preparing statements in any application is always a good idea. Overall performance is increased when a statement that is run more than once is prepared once and then executed many times. Obviously, the more a statement is executed, the more noticeable the increase in performance will be when it is prepared. The practice of preparing statements whenever possible will enhance performance before performance becomes a problem. Remember, it is always easier to take out a few unnecessary prepare statements than to go back and add the code to prepare 256 statements.

Kevin Fennimore, UCI Consulting, 2413 Arbor Lane, Hillsborough, NC 27278, Phone: 919-644-1073.


An Introduction to INFORMIX Database Security

by Lester Knutsen

Informix applies security at the database, table and column level. All security is based on a user's login name. Using Informix SQL, the creator of a database can give and remove privileges. The SQL command to give access is the grant command; the revoke command removes privileges.

Database Privileges

There are three levels of database privileges: DBA, Resource and Connect. Connect privilege allows a user to access the database and add, modify and delete data. Resource privilege includes all Connect privileges and, in addition, the ability to create new tables, indexes and procedures. DBA privilege includes Connect and Resource privileges, and also gives the user the authority to grant Resource, Connect or DBA privilege to another user. A user with DBA privilege can also drop any object in the database, or drop the database entirely.

Connect privilege is sufficient for most database activity. A programmer or developer who is creating new objects may require Resource privilege. DBA privilege is like the "super-user" of the Unix Operating System. Only a few selected users should be granted DBA privileges.
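As a sketch, the database-level privileges are given and removed with statements like these (the user names are invented for the example):

	GRANT CONNECT TO mary;
	GRANT RESOURCE TO developer1;
	GRANT DBA TO dbadmin;
	REVOKE RESOURCE FROM developer1;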

Table Privileges

There are six basic table privileges and a seventh privilege that was added with Informix release 5.0. The privileges are:

		Select	- view data in a table
		Insert	- add new rows of data to a table
		Update	- change existing rows of data in a table
		Delete	- remove rows of data from a table
		Index	- add indexes to a table (a user must also have the Resource database privilege to index a permanent table)
		Alter	- modify the structure of a table
		Reference	- reference columns in referential constraints (new with Release 5.0)

Column Privileges

There are two column-level privileges: select and update. In order for column-level privileges to be effective, the user should not have select or update privileges at the table level; table-level select or update privileges override column-level privileges. Column-level privileges limit what a user can see or change to specified columns of a table.
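For example, to let a user view the customer table but change only its address and phone columns (column names from the stores demonstration database; the user name is invented):

	GRANT SELECT ON customer TO mary;
	GRANT UPDATE ( address1, phone ) ON customer TO mary;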

Public Privileges

Informix uses the keyword public to identify privileges that apply to all users. To allow all users on a system to access a database, the database privilege Connect would be granted to public. To allow all users with access to a database to select data from a column, select privileges would be granted to public. The public keyword is a very powerful tool for allowing open access to your data. By the same token, to prevent users who are not specifically authorized from connecting to a database and selecting data, public privileges can be revoked entirely.
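As a sketch, opening a database to everyone and then closing a sensitive table back down might look like this:

	GRANT CONNECT TO PUBLIC;
	REVOKE ALL ON customer FROM PUBLIC;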

Owner Privileges

The owner of an object (database, table, etc.) has special privileges over that object. The owner has all seven table-level privileges, and these cannot be revoked from the user who created the object. This can create problems when a system moves from development to production, or when the original owner of a table or database leaves the organization. Owner privileges are also not easily identified: when you use the Informix SQL "info" command to see table privileges, owner privileges are not shown, and another user, even a DBA, cannot revoke privileges from an owner. For these reasons, when a database or table is created, the owner must be carefully chosen. The owner of a table is also the only one who can grant other users privileges on that table.

Informix Defaults

When an Informix database is created, the only database-level privilege granted is DBA privilege to the creator. When a table is created, by default public is granted select, update, insert and delete privileges on that table. This means that unless the privileges are changed, any person with database access can view and modify the table.

Database Security Model

Establishing database security takes planning. For best results, start when the database is created. A production environment is different from an environment where programs and the database are still under development; to facilitate database security, two environments are needed - one for the development of software and one for the production system. Another key component of an effective database security model is the definition and assignment of the roles of Database Administrator and Database Security Officer. These roles may be combined and handled by one individual or user login.

The Database Administrator

It is strongly recommended that an abstract user login be created as the Database Administrator (DBA). This is essentially a super-user for a database. Establishing a separate login minimizes the risk of a user accidentally dropping a database they have created. This is especially important in a production environment. The DBA login name should be the creator of the database and all tables in the database. The DBA should be the only user with Alter privileges on any table in the database. As the owner of the database and all tables within it, the DBA can grant privileges to other users.

Lester Knutsen, Advanced DataTools Corporation, 4510 Maxfield Drive, Annandale, VA 22003, Phone: 703-256-0267 or Email: lester@access.digex.net.


Washington Area Informix User Group Sponsorship

The following options are available for companies who would like to participate in our activities:

Presentation at Meetings

The user group meets every other month. We would like to have one presentation per meeting from vendors that have products that work with Informix. If you would like to make a presentation at a meeting please contact Lester Knutsen at 703-256-0267.

Newsletter Sponsorship

The newsletter is produced quarterly. Each mailing goes to over 500 users in the Washington area. Companies sponsoring the newsletter may place a one page ad. If you would like to sponsor an issue, please contact Lester Knutsen at 703-256-0267.

Forum 1994 - An Informix User Group One Day Event

We are planning a one-day user group forum in November. John Petruzzi is the coordinator for this event. If you are interested in demonstrating your products, speaking or conducting a seminar, please contact John Petruzzi at 703-490-4598.


This Newsletter is published by the Washington Area Informix Users Group.

Lester Knutsen, President/Editor

Washington Area Informix Users Group

4216 Evergreen Lane, Suite 136, Annandale, VA 22003

Phone: 703-256-0267

lester@access.digex.net