6.12 Server Push6.12.1 ProblemYou want to implement a code-based server-push replication mechanism with your SQL database server to notify client applications that an event has occurred. For example, you may wish to update a help-desk supervisor's screen each time a new help-desk call arrives. Your goal is to have an external system notified of significant activities or events as they occur. 6.12.2 SolutionThis solution shows you how to simulate a server-push mechanism by repeatedly polling from the client at regular intervals. To start with, you need to have a table on the server to which you can log the activities that you want pushed. The following Event table will be used in this recipe: CREATE TABLE Event ( EventId INTEGER IDENTITY, Code INTEGER, EventTime DATETIME, PRIMARY KEY(EventId) ) This Event table logs events that are generated within the system. Different event codes would be used for different types of events. For each logged event, a timestamp is also recorded in the Event table. The event-log entries are generated by the software operating on the database, perhaps through the use of a stored procedure. See Section 6.10 recipe for an example. The idea in this recipe is to simulate server push with a client-pull mechanism. To do this, you have to create an external program that repeats one query indefinitely. This query repeatedly polls the database for new log entries, and the client program can then initiate appropriate actions in response to those entries. The following pseudocode illustrates the process that must occur in the client software: init @mark REPEAT SELECT Code,EventId FROM event WHERE @mark < EventId ORDER BY EventId store read rows into internal structure FOR EACH row in the structure initiate the action for the event @mark:=id END FOR UNTIL forever The variable @mark in this example is an internal client variable that holds the ID of the log entry most recently read from the event table. Each time a log record is read, the @mark variable is updated. The polling query uses @mark to ensure that only new log entries are retrieved. Only log entries with ID numbers greater than @mark are retrieved. 6.12.3 DiscussionThe solution opens up a wide variety of possibilities and allows programmers to give an active role to their database systems. Rather than being limited to serving requests as they come in from clients, the database can now be the initiator of an event or of a process. Newer database systems have built-in support for both push and pull replication mechanisms; however, these mechanisms are usually not portable across vendor platforms. The solution presented here is portable and provides the flexibility of integrating SQL-based systems with external systems. A stored database procedure can now initiate an action in a system external to the database. You may, for example, use a stored procedure to trigger pager notification to the DBA or system operator when a significant event occurs. 6.12.3.1 TicketingHere ticketing refers to the practice of assigning unique identifiers to each request made by the user of a system. A help-desk system, for example, might create a trouble ticket each time a user calls with a problem, and each trouble ticket would have a unique number to identify it. Proper use of the @mark variable is critical when implementing polling code such as that shown in this recipe. First, you need to make sure that @mark is properly initialized. You need to decide what records you want the client to pull after it is started. Do you want the client to pull all log records that are available? Do you want to pull only those records logged after the client starts? Do you want the client to catch up with any records that might have been missed when the client was not running? Initializing @mark to 0 each time you start the client will result in the client reading all existing log records each time it starts. It's unlikely that you'll want that behavior. A more reasonable approach is to have the client poll for new events that occur after the client starts. You can do that by using a query to grab the highest log ID number from the event table, such as the following: SELECT @mark=MAX(EventId) FROM Event By initializing @mark like this each time you start your client, you ensure that you will see only new log entries. That's because those new entries will have IDs greater than the maximum value at the time the client starts. If it is important that your client program processes all event records, even those posted during periods when the client wasn't running, you can add a column to the event table that allows you to flag rows that have been processed. Then, you can poll for those rows that haven't yet been processed. In such cases, you have to initialize @mark variable to the smallest Id that still hasn't been sent: SELECT @mark=MIN(EventId) FROM Event WHERE Sent=FALSE Assuming that your flag column is named Sent, and that it is a Boolean column, the following pseudocode illustrates the process to follow: init @mark REPEAT SELECT Code,EventId FROM Event WHERE @mark < EventId and Sent=FALSE ORDER BY EventId store read rows into internal structure FOR EACH row in the structure BEGIN TRAN initiate the action for the event UPDATE Event SET Sent=TRUE WHERE Id=@id @mark:=@id END TRAN END FOR UNTIL forever Newly inserted rows will have the Sent column set to FALSE because they have not yet been processed by the client. Your client query can key off that to ensure that it only brings back unprocessed rows. You can also use the sent column to your advantage when you initialize the @mark variable. You just set @mark to the ID number of the earliest log record that hasn't yet been processed. For example: SELECT @mark=MIN(EventId) FROM Event WHERE Sent=FALSE The solution shown here ensures that each event is pushed only once. However, it does require additional UPDATE statements, which result in the use of server-side processing capacity. As with anything else, you have to look at the tradeoffs and decide if the technique makes sense for a given situation. 6.12.3.2 Polling as an alternative to triggersThis solution can also be viewed as an alternative to database triggers. Many database systems support triggers, but when built-in triggers are used, an action is initiated every time a row is added, deleted, or modified. Depending on what the trigger does, this can place a significant burden on the server, especially in times of peak activity. The time required for trigger execution can also add significantly to the time needed to execute a DDL statement. Rather than use triggers, a client program can be configured to poll the server once per second (or some other acceptable time period) for a list of changes that have occurred since the previous poll. The client can then initiate appropriate actions. This results in a slight delay between the time a change is made and the proper events are triggered, but it also moves much of the work into the background where it won't affect the performance of the DML statements making the change. Let's say that we have a small ticketing server, which stores quotes from a stock exchange. We want to use a server-push mechanism to send quotes to the user's screen, but if we use internal triggering, every new quote will end up being sent to, and displayed on, the user's screen. If activity peaks, and the volume of quotes reaches into the hundreds per second, the user will be overwhelmed with information. A better solution might be to have the client polled once per second and only retrieve the most recent quote for each ticker symbol. A query, such as the following, might be used for this purpose: SELECT Symbol, MAX(CASE WHEN EventId=MAX(EventId) THEN Quote ELSE NULL END) quote FROM Quote WHERE @mark<EventId GROUP BY Symbol The query returns the last known quote for each symbol so long as that quote has been logged with an ID greater than that held in the @mark variable. The use of @mark saves us from having to retrieve the same quote more than once. Only new quotes since the previous poll need be retrieved. 6.12.3.3 Client stabilityThe polling solution described in this recipe brings additional flexibility to the table, but it also carries with it some potential reliability problems. When you separate the triggering mechanism from the server, you introduce another potential point of failure. If the client software fails, or if the connection between the client and the server is lost, you lose the whole push mechanism. Think carefully about whether it pays off to introduce such a solution or not. Sometimes there is no other choice, and sometimes it is simply handier than other possible solutions. We can not overstress the importance of having proper error checking embedded in the code of the client. You may also want to implement a secondary service that periodically pings the client to check on performance and availability. 6.12.3.4 Support for multiple tablesAll the code in this recipe has dealt with polling only one event table. You can easily add support for multiple tables by extending the same logic to those tables. You do not even need to add additional clients to poll more tables. One client can be written that periodically executes several SELECT queries against several different event tables. You do, however, need to guard against asking one client to handle too much work. |