The remainder of this section outlines the functionality of each DSP/BIOS kernel module, while presenting some insight into its typical use within a target application. A complete listing of all DSP/BIOS kernel functions appears at the end of this document. LOG - Event Log Manager This module manages kernel log objects used to capture both application- and system-level events that occur during the course of target program execution. The LOG module exports a basic set of run-time functions for appending designated log objects with fixed-length messages corresponding to program events. Messages held in these logs can subsequently be uploaded and displayed in real-time through a hosted visual analysis utility. The BIOSuite configuration tool supports creation of multiple log objects, each serving a distinct purpose within the target program. Attributes such as the length and location of a log's internal message buffer are statically assigned using the configuration tool, allowing developers to make appropriate tradeoffs in the amount of memory dedicated to event logging versus the number of program events that must be retained. Each log object carries an additional attribute which pre-defines one of two distinct modalities of operation as its buffer saturates:
Each event consumes four words of message storage within the log's internal buffer, where the first word of the message holds a unique (16-bit) sequence number used to collate log contents on the host. The remaining three words of this internal message structure represent event-dependent codes and data values supplied as parameters to client functions such as LOG_event, used to append new events to a log object. Though held in memory and uploaded through the real-time host/target link as 16-bit binary values, event messages are ultimately rendered in a user-specified textual format within a scrolling console window on the host. As a further convenience for C as well as assembly language programmers, the LOG module supports a distinctive variation of the ever-popular printf. Called LOG_printf, this kernel function accepts a format string plus two data values as parameters and writes a corresponding event message to the designated log object, but without formatting the data. Instead, the message carries a special event code in place of the client-supplied format string (which need not even reside in target memory); a BIOSuite host utility later uses this code to retrieve the original string when formatting the adjoining pair of data values for console output. Because of this calculated partitioning of responsibility between target and host, DSP/BIOS API calls to LOG_event and LOG_printf return in well under 1 µs on 16-bit TMS320 DSPs and hence prove suitable for embedding almost anywhere in the program source, even in the most time-critical interrupt routines coded in optimized assembly language. More than transient scaffolding for analyzing subtle timing-related problems arising during software development, DSP/BIOS event logging support can remain in place within production systems to support application diagnostic procedures for the field or factory.
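The split of work between target and host described above can be sketched in plain C. This is only an illustration under stated assumptions, not the kernel's implementation: the names DemoLog, demo_log_printf, demo_host_render and demo_formats are hypothetical, the buffer length is arbitrary, the choice to drop events once the buffer is full is an assumption, and the use of the second word as the format code is simply one way to lay out the three event-dependent words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_LOG_EVENTS 8   /* hypothetical buffer length, in events */

/* One event = four 16-bit words: sequence number, code, two data values. */
typedef struct {
    uint16_t words[DEMO_LOG_EVENTS][4];
    uint16_t next_seq;   /* sequence number for the next event */
    size_t   count;      /* events stored so far */
} DemoLog;

/* Host-side table mapping an event code back to its format string.
 * The target stores only the index, never the string itself. */
static const char *const demo_formats[] = {
    "frame %u took %u cycles",
    "channel %u dropped %u samples"
};
#define DEMO_NUM_FORMATS (sizeof demo_formats / sizeof demo_formats[0])

/* Target side: append a raw event; no formatting happens here.
 * Assumption: once the buffer is full, further events are dropped. */
void demo_log_printf(DemoLog *obj, uint16_t fmt_code, uint16_t a, uint16_t b)
{
    assert(obj != NULL);
    if (obj->count >= DEMO_LOG_EVENTS)
        return;
    uint16_t *ev = obj->words[obj->count++];
    ev[0] = obj->next_seq++;   /* sequence number */
    ev[1] = fmt_code;          /* stands in for the format string */
    ev[2] = a;
    ev[3] = b;
}

/* Host side: look up the format string by code and render the text.
 * Returns the rendered length, or -1 for a bad index or unknown code. */
int demo_host_render(const DemoLog *obj, size_t index,
                     char *out, size_t out_len)
{
    assert(obj != NULL && out != NULL);
    if (index >= obj->count)
        return -1;
    const uint16_t *ev = obj->words[index];
    if (ev[1] >= DEMO_NUM_FORMATS)
        return -1;
    return snprintf(out, out_len, demo_formats[ev[1]],
                    (unsigned)ev[2], (unsigned)ev[3]);
}
```

The target-side function does nothing but four stores and two increments, which is the point of the design: all string handling is deferred to the host.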
STS - Statistics Accumulator Manager This module manages a class of kernel objects termed statistics accumulators, used to amass summary information about different time-varying elements of the target application program. The STS module exports a basic set of run-time functions which, using a designated accumulator object, maintain several key statistics computed from a series of data values supplied by the target application during the course of execution. A BIOSuite host utility periodically uploads and displays these program statistics in real-time. Like other kernel objects, individual statistics accumulators are statically created and initialized during program generation using the BIOSuite configuration tool. Consuming only a small amount of target memory, each of these objects internally maintains three 32-bit variables used to accumulate the following statistics with less than .5 µs (microseconds) of overhead:
DSP/BIOS statistics accumulators also prove useful for tracking the absolute CPU utilization of distinct program threads over the course of execution, given a means to count instruction cycles or measure the passage of time with sufficient precision. Using the functions STS_set and STS_delta in tandem on a common accumulator object, developers can automatically gather real-time performance statistics about different portions of their application by simply bracketing the appropriate sections of the program source with these kernel API calls. Depending upon its intended use, a statistics accumulator's internal 32-bit variables run the risk of overflow if updated thousands of times per second. For this reason, the BIOSuite analysis utility uploads and resets each target accumulator at an adjustable periodic rate (typically once or twice a second); the utility then folds these raw target statistics into a corresponding trio of extended-precision variables maintained on the host. A rich set of options is furnished for visually formatting the basic accumulated statistics as well as derived values such as the average. A simple yet flexible mechanism, DSP/BIOS statistics accumulators nicely complement the explicit yet finite retention of discrete program events afforded by more memory-intensive kernel log objects. Offering a means to monitor program activity from an adjoining host with little time or space impact upon the target, statistics accumulators are just as likely to remain embedded within production application programs as are other DSP/BIOS objects. CLK - Clock Manager This module implements a pair of logical 32-bit real-time clocks used to measure the passage of time in conjunction with STS accumulator objects as well as to generate special time stamp messages for system event logs. Utilizing the on-chip timer hardware present in most TMS320 DSPs, the CLK module supports time resolutions down to the single instruction cycle.
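The STS_set / STS_delta bracketing described above, fed by time values of the kind a real-time clock supplies, might look roughly like the following sketch. Everything named Demo* or demo_* is hypothetical. The choice of count, total and maximum as the three accumulated statistics is an assumption made for illustration, as is storing the baseline value in an extra field; the host-side structure simply mirrors the "extended-precision" folding mentioned in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Target-side accumulator (assumed layout: count, total, maximum). */
typedef struct {
    uint32_t count;
    uint32_t total;
    uint32_t max;
    uint32_t prev;    /* baseline saved by demo_sts_set (assumption) */
} DemoSts;

/* Host-side extended-precision copy of the same statistics. */
typedef struct {
    uint64_t count;
    uint64_t total;
    uint32_t max;
} DemoHostSts;

/* Fold one data value into the accumulator. */
void demo_sts_add(DemoSts *s, uint32_t value)
{
    assert(s != NULL);
    s->count++;
    s->total += value;
    if (value > s->max)
        s->max = value;
}

/* Remember a starting value, e.g. a time stamp taken before a section. */
void demo_sts_set(DemoSts *s, uint32_t value)
{
    assert(s != NULL);
    s->prev = value;
}

/* Accumulate the difference between a new value and the saved baseline.
 * Unsigned subtraction keeps the result correct across one wraparound. */
void demo_sts_delta(DemoSts *s, uint32_t value)
{
    assert(s != NULL);
    demo_sts_add(s, value - s->prev);
}

/* Host polling step: fold the target values into the wide host copy,
 * then reset the target accumulator so it cannot overflow. */
void demo_sts_upload(DemoSts *s, DemoHostSts *h)
{
    assert(s != NULL && h != NULL);
    h->count += s->count;
    h->total += s->total;
    if (s->max > h->max)
        h->max = s->max;
    s->count = 0;
    s->total = 0;
    s->max = 0;
}

/* Derived statistic computed on the host only. */
double demo_host_average(const DemoHostSts *h)
{
    assert(h != NULL);
    return h->count ? (double)h->total / (double)h->count : 0.0;
}
```

In use, a section of code would be bracketed by demo_sts_set(&s, t0) before and demo_sts_delta(&s, t1) after, with t0 and t1 read from whatever cycle or time counter is available.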
This module can also serve as a default "heartbeat" for driving the execution of periodic program functions, as described later in the context of the PRD module. ![]() Figure 2. Relationship of timer apparatus and CLK module Figure 2 depicts the interrelationship between the underlying timer apparatus and the CLK module's own internal mechanisms for synthesizing a high- and low-resolution real-time clock. Beginning with the on-chip timer, CLK automatically initializes its control and period registers based upon parameters (conveniently) assigned using the BIOSuite configuration tool. Though capable of counting with single-cycle precision, its 16-bit period register - barely spanning an interval of 1 ms on today's 40 MIPS devices - limits the timer's capacity as a real-time clock. To extend the timer's range, CLK relies upon an interrupt generated at the end of each period to increment an internal 32-bit counter. As illustrated above, the function CLK_gethtime uses this variable in combination with the timer's own hardware counter to synthesize a true high-resolution 32-bit time value, offering immediate support for performance measurement in conjunction with DSP/BIOS statistics accumulators. Since the duration of the timer period can be configured with any 16-bit value, CLK_gethtime performs the necessary calculations to scale the module's internal counter variable while guarding against timer interrupt race conditions - all with less than .5 µs (microseconds) of overhead per function call. For applications requiring a real-time clock with less precision but greater range, the function CLK_getltime offers a controlled interface to the module's internal 32-bit interrupt counter. With timer interrupts typically occurring at frequencies of around 1 kHz, the low-resolution clock represented by this counter is particularly useful for generating time stamps in applications that employ DSP/BIOS logs to capture program events over extended periods of time. 
SIG - Software Signal Manager This module manages a class of objects termed software signals that lay the foundation for structuring DSP/BIOS applications around a prioritized hierarchy of real-time threads. Patterned after interrupt service routines, the SIG module internally schedules execution of a corresponding handler function (written in C or assembler) in response to client programs triggering individual signal objects through kernel API calls. Once triggered, execution of a signal handler will strictly preempt any current background activity within the program as well as any signals of lower priority; interrupt routines, on the other hand, take precedence over software signals and remain enabled during execution of all handlers, allowing timely response to hardware peripherals within the target system. As Figure 3 portrays, DSP/BIOS software signals enrich familiar embedded designs limited to only two classes of program threads - high-priority interrupt routines and low-priority background functions - by introducing an additional band of prioritized foreground threads that fits between the two extremes.
[Figure 3. DSP/BIOS software signals add a new dimension to embedded design.]
Statically defined using the BIOSuite configuration tool, the attributes of individual signal objects include a handler function address f, a pair of handler function arguments x and y, and an execution priority p ranging from 1 to 15 (with background functions and interrupt routines conceptually assigned to execution levels 0 and 16 respectively). Operationally, triggering a signal causes a C-style invocation of the form (*f)(x,y) to occur at some later point in time, depending upon the value of the priority p relative to other ongoing and pending program activities. Multiple signal objects assigned the same priority level will execute in a first-come, first-served manner once posted.
Individual signals are triggered for execution through calls to the kernel function SIG_post, which can be embedded in virtually any thread within the target program - interrupt service routines, background functions, or other signal handlers. Depending upon the thread's current level of execution, calls to SIG_post either:
[Figure 4. SIG module dispatches handlers during course of execution.]
To illustrate how the SIG module dispatches handlers during the course of execution, consider a common application scenario in which an interrupt routine posts a signal whose priority k is higher than the program's current execution level j at the time the interrupt occurred. The background function or low-priority signal handler executing in the first stage of this example is asynchronously preempted by an incoming hardware interrupt, triggering a corresponding interrupt service routine and implicitly raising the program priority to its maximum value (level 16). The interrupt routine executing in the next stage then posts a foreground signal that will run at priority level k, though execution of the signal handler is deferred until after the interrupt service routine completes and incoming hardware interrupts are re-enabled. The SIG module takes control during the transition into the third stage of this example and performs a context switch, automatically saving the processor's register file onto the program stack before dispatching any posted signal handlers in strict priority order. Like interrupt service routines, DSP/BIOS signal handlers are program functions that run to completion and, upon return, allow SIG to dispatch additional handlers of equal or lower priority; with no further signals pending in this example, the processor context is automatically restored as program execution returns in the final stage to its original state. Both the time and space overhead introduced by the SIG module are remarkably low, enabling broad deployment of DSP/BIOS signal objects within embedded DSP applications. From the standpoint of memory, SIG statically allocates only 10 words of internal storage for each signal object configured within the program, encouraging, for example, a "signal-per-channel" design methodology in high-density telecommunication systems.
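The dispatch rules in this scenario (strict priority order, first-come first-served within a level, handlers that run to completion, posting deferred while an interrupt routine is active) can be modelled in a few dozen lines of portable C. The sketch below is a simulation only: it saves no registers, and every Demo* / demo_* name, including the C stand-ins for the enter and exit steps of an interrupt routine, is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BACKGROUND 0    /* execution level of background functions */
#define DEMO_ISR_LEVEL  16   /* execution level of interrupt routines   */

typedef struct {
    void (*f)(int x, int y);   /* handler function */
    int x, y;                  /* handler arguments */
    int priority;              /* 1..15 */
    unsigned long posted_at;   /* 0 = not pending, else order of posting */
} DemoSig;

typedef struct {
    DemoSig *sigs;
    size_t nsigs;
    int level;                 /* current execution level */
    unsigned long posts;       /* running count of posts */
} DemoSched;

/* Highest-priority pending signal above the current level; ties go to
 * the signal posted first. Returns NULL if nothing may run now. */
static DemoSig *demo_pick(DemoSched *s)
{
    DemoSig *best = NULL;
    for (size_t i = 0; i < s->nsigs; i++) {
        DemoSig *c = &s->sigs[i];
        if (c->posted_at == 0 || c->priority <= s->level)
            continue;
        if (best == NULL || c->priority > best->priority ||
            (c->priority == best->priority && c->posted_at < best->posted_at))
            best = c;
    }
    return best;
}

/* Run every eligible handler to completion, in priority order. */
void demo_sig_run(DemoSched *s)
{
    DemoSig *sig;
    while ((sig = demo_pick(s)) != NULL) {
        int saved = s->level;
        sig->posted_at = 0;
        s->level = sig->priority;
        sig->f(sig->x, sig->y);
        s->level = saved;
    }
}

/* Mark a signal pending; it runs at once only if it outranks the caller. */
void demo_sig_post(DemoSched *s, DemoSig *sig)
{
    assert(sig->priority > DEMO_BACKGROUND && sig->priority < DEMO_ISR_LEVEL);
    if (sig->posted_at == 0)
        sig->posted_at = ++s->posts;
    demo_sig_run(s);
}

/* Stand-ins for the start and end of an interrupt routine. */
int demo_hwi_enter(DemoSched *s)
{
    int saved = s->level;
    s->level = DEMO_ISR_LEVEL;
    return saved;
}

void demo_hwi_exit(DemoSched *s, int saved)
{
    s->level = saved;
    demo_sig_run(s);
}

/* Trace of what ran, in order, for the demonstration below. */
int demo_trace[8];
size_t demo_trace_len;

void demo_record(int id, int unused)
{
    (void)unused;
    if (demo_trace_len < 8)
        demo_trace[demo_trace_len++] = id;
}

/* An interrupt at background level posts three signals, then finishes. */
void demo_scenario(void)
{
    DemoSig sigs[3] = {
        { demo_record, 1, 0, 2, 0 },   /* low priority             */
        { demo_record, 2, 0, 9, 0 },   /* high priority            */
        { demo_record, 3, 0, 9, 0 },   /* high priority, same level */
    };
    DemoSched s = { sigs, 3, DEMO_BACKGROUND, 0 };
    int saved;

    demo_trace_len = 0;
    saved = demo_hwi_enter(&s);
    demo_sig_post(&s, &sigs[0]);
    demo_sig_post(&s, &sigs[2]);
    demo_sig_post(&s, &sigs[1]);
    demo_record(0, 0);                 /* interrupt routine still running */
    demo_hwi_exit(&s, saved);          /* handlers are dispatched here    */
}
```

Because each handler is an ordinary function call made from the dispatcher, the model needs only one stack, which mirrors the run-to-completion property described in the text.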
Even more significant, since signal handlers execute to completion and return when dispatched, the DSP/BIOS kernel requires no more than the single software stack already allotted for program run-time support. Looking at execution overhead that can potentially impair a program's ability to respond to real-time events within prescribed deadlines, a key CPU benchmark characterizing any preemptive multi-tasking kernel is an asynchronous context switch - the time required to save/restore all processor registers and essentially effect the transitions between steps 2-4 in the earlier example. On today's 40 MIPS TMS320 devices the DSP/BIOS kernel overhead is under 6 µs (microseconds) for each interrupt routine that calls SIG_post, suggesting minuscule impact within applications whose processing cycles fall under the 1 kHz threshold. With low overhead per context switch, DSP/BIOS signal objects also become an ideal mechanism for structuring programs that employ multiple algorithms executing at different rates - for example, a telecommunication application where voice coding, tone detection, and echo cancellation typically process a common 8 kHz input stream using frames of differing duration that might range from 1-20 ms (milliseconds). Since signal handlers will preempt one another on a strict priority basis, binding algorithms with shorter deadlines to higher priority signals - a technique known as rate monotonic scheduling - ensures orderly interleaving of otherwise independent real-time threads that each contend for their respective allotment of processor cycles. HWI - Hardware Interrupt Manager This module manages a finite class of objects corresponding to individual hardware interrupts recognized by the underlying DSP platform. The run-time support provided by the HWI module enables interrupt service routines associated with these objects to schedule execution of DSP/BIOS software signals in the manner illustrated earlier. 
Of necessity, program interrupt routines that directly or indirectly invoke SIG_post (or one of its variants) must begin and end with embedded API calls to HWI_enter and HWI_exit, giving the kernel momentary control at these critical junctures during execution. Operationally, the function HWI_enter effectively raises program execution priority to its maximum level for the duration of the interrupt service routine. The function HWI_exit restores execution priority to its former level upon returning from the interrupt routine and, if signal objects of higher priority have been posted, transfers control to SIG, which in turn performs a context switch prior to dispatching any pending signals. Unlike other kernel functions, HWI_enter and HWI_exit are not accessible through the C-callable DSP/BIOS API, under the assumption that interrupt routines (or at least their prologues and epilogues) are invariably coded in assembly language. This pair of services is in fact implemented as macros that expand in-line to fewer than 10 instructions in total. IDL - Idle Loop Manager This module manages a class of objects representing low-priority, background threads that execute when no other program signal handlers or interrupt routines are active. The IDL module controls execution of all background threads within the target application program through a centralized idle loop inside the DSP/BIOS kernel. Not unlike signals, individual background threads are statically created using the BIOSuite configuration tool and include among their attributes a program function address f along with a pair of function arguments x and y. Background threads cannot be prioritized, however, and conceptually execute in a round-robin fashion at a single priority level below that of any signal. The IDL module by convention assumes control once the target program returns from its main function, the latter customarily used for system initialization in DSP/BIOS applications.
After enabling hardware interrupts (thus commencing the real-time activation of interrupt routines and signal handlers), IDL will meanwhile loop through an internal list of all background threads and call each of the pre-configured functions. PRD - Periodic Function Manager This module manages a class of objects representing individual C or assembly-language program functions that execute periodically with respect to an underlying time base. Activation of the PRD module is driven by regular calls to the kernel function PRD_tick which in turn schedules execution of individual periodic functions. Statically defined like other program threads, the attributes of individual PRD objects include a function f, arguments x and y, and an integral rate r. The PRD module internally invokes the program function (*f)(x,y) once every r calls to the kernel function PRD_tick. For convenience, the latter function can be implicitly invoked by the timer interrupt routine internally controlled by the CLK module if so configured; or, the target program can explicitly call the kernel function PRD_tick from any periodic interrupt routine already present within the target system, triggered (say) by peripherals that produce or consume continuous streams of data. Internally, the PRD module dispatches and executes all periodic functions within the context of a special signal object automatically created by the BIOSuite configuration tool and implicitly posted through calls to PRD_tick. Following the tenets of rate monotonic scheduling, this special signal should be assigned an execution priority reflective of the underlying PRD_tick rate relative to other program deadlines. In many applications featuring multi-rate algorithms the PRD signal in fact executes at the highest priority, with individual periodic functions pre-configured with stock calls to SIG_post that in turn trigger other signals of lower priority to subsequently execute. ![]() Figure 5. 
Typical telecommunications example of PRD module. In this particular example - typical of telecommunications - we presume signals s3, s2, and s1 process a common 8 kHz input stream by applying distinct algorithms to data frames comprising 40, 96, and 160 points respectively. With a fundamental PRD_tick rate of 8 kHz provided by the input stream itself, the periodic rate r assigned to each PRD object in fact mirrors that number of input points processed by the target signal. (Though PRD_tick is called every 125 µs, static analysis by the BIOSuite configuration tool divides-down each of the assigned periodic rates by a factor of 8 in the generated kernel data structures, effectively causing the PRD signal to execute at a slower frequency in the target system.) Besides serving as the backbone for all cyclic processing within an embedded DSP application, the mechanisms internal to PRD can themselves be used to synthesize an alternative real-time clock to that furnished by the CLK module. Not unlike CLK_getltime, the module function PRD_getticks provides a controlled interface to an internal 32-bit counter incremented each time the target program calls PRD_tick. With the latter function invoked at 125 µs intervals in the prior example, the derivative real-time clock provides a convenient time base for measuring the latency of individual algorithms with sufficient precision yet minimal overhead. TRC - Trace Control Manager This module manages a finite set of trace control objects, used to selectively trigger real-time capture of program information through DSP/BIOS event logs and statistics accumulators. The TRC module exports a basic set of functions for enabling or disabling individual trace controls as well as querying the state of these objects. These functions can be directly called by the target program through the kernel API or else interactively invoked using a hosted BIOSuite utility. 
The need for TRC arises from the fact that several kernel modules - notably SIG though others as well - can automatically log events and accumulate statistics reflecting the behavior of individual program objects at run-time. The SIG module, as an example, maintains a private statistics accumulator for each signal object configured within the system, and implicitly uses this accumulator to tally performance information whenever individual signal handlers are dispatched. SIG also captures a history of overall program flow by writing special events to a distinguished system log each time a signal object is posted, dispatched, and terminated. Table I summarizes the manner in which LOG and STS are internally utilized by other kernel modules as part of an overall strategy for automatically instrumenting target application programs otherwise structured around DSP/BIOS run-time services.
With so many modes of automatic instrumentation supported by the DSP/BIOS kernel, it becomes critical for the target application program to maintain precise control over which forms of data capture are active at any point in time. Recognizing that event logging and statistics accumulation do incur some overhead at run-time (albeit small), developers will want to tradeoff the degree of intrusiveness these mechanisms introduce into their programs against the amount of useful information these mechanisms yield about the program during its course of execution. Calling the functions TRC_enable and TRC_disable in conjunction with a handful of built-in trace control objects corresponding to the entries in this table, the target application can independently start and stop each mode of automatic instrumentation supported within the DSP/BIOS kernel. This level of control enables target software to explicitly initiate and terminate different classes of event logging or statistics accumulation in response to exceptional conditions encountered during the course of execution - effectively implementing a forward or backward program trace triggered by the application itself, appealing to our earlier metaphor of the logic analyzer. Through the BIOSuite configuration tool, developers can define a limited number of additional trace control objects that track their own use of event logs or statistics accumulators. In general, application-level API calls to LOG or STS functions can be dynamically enabled or disabled with minimal run-time overhead if invoked conditionally using status returned by TRC_query, as illustrated by the fragments of C code. if (TRC_query(myLogCtrl) != 0) { /* log an event */
LOG_printf(myLogObj, `...`);
}
if (TRC_query(myPerfCtrl) != 0) { /* measure performance */
STS_delta(myPerfObj, CLK_gethtime());
}
As noted earlier, trace control objects (built-in or user-defined) can be independently
enabled and disabled from a host analysis utility as well as from the target program. This
gives developers the opportunity to control system- or application-level capture on
a more ad hoc basis. PIP - Data Pipe Manager This module manages a class of kernel objects termed data pipes, used to buffer streams of program input and output typically processed by embedded DSP applications. DSP/BIOS data pipes represent a common software building-block required for driving the sorts of real-time I/O peripherals deployed in DSP systems, ranging from synchronous devices like analog codecs and TDM highways which continuously produce and consume data samples to asynchronous devices like UARTs and host ports which are more bursty in nature. ![]() Figure 6. BIOSuite configuration tool. Statically defined using the BIOSuite configuration tool, each pipe object manages an internal circular list comprising n data frames of length k that conceptually flow from a writer at the pipe's tail to a reader at its head, and then back again. Pipe objects can optionally be configured to notify the reader or writer whenever full or empty frames become available, as Figure 6 illustrates. To write to the tail end of the pipe, the program initially calls the function PIP_alloc to retrieve a descriptor <p, k> from the internal list structure, where p points to the next available empty frame and k represents its statically configured length. After writing up to k words into the frame, the program enqueues the new data frame at the tail of the pipe by calling the function PIP_put with a modified descriptor <p, k´> where k´ is less than or equal to k. This particular descriptor will be subsequently dequeued from the head of the pipe when the function PIP_get is called, at which time the program reads the k words of data and recycles the frame by calling PIP_free. Each pipe object maintains two internal counters at each end used to track the number of unread and unwritten data frames and to synchronize the reader and writer accordingly. 
Each invocation of PIP_put or PIP_free increments the counter at the opposite end of the pipe and, if configured, will execute a distinct callback function to notify the reader or writer thread that PIP_get or PIP_alloc can be safely invoked; the latter functions in turn decrement these internal counters. Alternatively, the reader (writer) thread can operate in a polled mode and directly test the counter maintained at the head (tail) of the pipe before retrieving the next full (empty) frame. When used to buffer real-time I/O streams produced or consumed by a hardware peripheral, pipe objects often serve as a conduit between the interrupt routine triggered by the peripheral itself and the program function that ultimately reads or writes the data in the context of a DSP/BIOS signal. Unless already scheduled to run on a periodic basis through some other means (such as PRD), the application can effectively synchronize itself with the underlying I/O stream by configuring the pipe's callback mechanism to automatically post a particular signal object whenever the interrupt routine calls PIP_put or PIP_free; the application can also use the callback mechanism in the opposite direction to initiate I/O whenever the device is found idle. Besides serving as a key building block for managing hardware I/O devices, DSP/BIOS pipe objects can perform double duty as a general message-passing facility between different program functions. With flow-control implemented through bi-directional callbacks and data copying left to the discretion of their clients, pipes become an elegant yet efficient mechanism for intra-thread communication within embedded DSP applications; and with connections to the appropriate data links in the target system, pipes can implement inter-processor communication as well. 
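The frame hand-off between writer and reader can be sketched as a small circular list with one counter per end and optional callbacks. This is an illustrative model under stated assumptions, not the PIP module itself: the frame count and length are arbitrary, and all Demo* / demo_* names and signatures are hypothetical rather than the kernel's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FRAMES      2   /* n: frames in the pipe (arbitrary)      */
#define DEMO_FRAME_WORDS 4   /* k: words per frame (arbitrary)         */

typedef struct DemoPipe {
    uint16_t frames[DEMO_FRAMES][DEMO_FRAME_WORDS];
    size_t used[DEMO_FRAMES];   /* words actually written per frame    */
    size_t head, tail;          /* next frame to read / to write       */
    size_t n_full;              /* reader's counter: unread frames     */
    size_t n_empty;             /* writer's counter: free frames       */
    void (*notify_reader)(struct DemoPipe *);  /* optional callback    */
    void (*notify_writer)(struct DemoPipe *);  /* optional callback    */
} DemoPipe;

void demo_pipe_init(DemoPipe *p,
                    void (*on_full)(DemoPipe *),
                    void (*on_empty)(DemoPipe *))
{
    assert(p != NULL);
    *p = (DemoPipe){0};
    p->n_empty = DEMO_FRAMES;
    p->notify_reader = on_full;
    p->notify_writer = on_empty;
}

/* Writer: obtain the next empty frame, or NULL if none is free. */
uint16_t *demo_pipe_alloc(DemoPipe *p)
{
    if (p->n_empty == 0)
        return NULL;
    p->n_empty--;
    return p->frames[p->tail];
}

/* Writer: enqueue the frame just filled with `len` words (len <= k). */
void demo_pipe_put(DemoPipe *p, size_t len)
{
    assert(len <= DEMO_FRAME_WORDS);
    p->used[p->tail] = len;
    p->tail = (p->tail + 1) % DEMO_FRAMES;
    p->n_full++;
    if (p->notify_reader)
        p->notify_reader(p);
}

/* Reader: obtain the next full frame and its length, or NULL if none. */
const uint16_t *demo_pipe_get(DemoPipe *p, size_t *len)
{
    if (p->n_full == 0)
        return NULL;
    p->n_full--;
    *len = p->used[p->head];
    return p->frames[p->head];
}

/* Reader: recycle the frame just consumed. */
void demo_pipe_free(DemoPipe *p)
{
    p->head = (p->head + 1) % DEMO_FRAMES;
    p->n_empty++;
    if (p->notify_writer)
        p->notify_writer(p);
}

/* Example callback: counts how often the reader was notified. */
int demo_reader_notified;

void demo_on_full(DemoPipe *p)
{
    (void)p;
    demo_reader_notified++;
}
```

In an application the reader callback would typically post a signal rather than bump a counter, and a polled reader would test n_full directly before calling demo_pipe_get. Note that the data is never copied by the pipe; both ends work on the frame storage in place.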
HST - Host Channel Manager This module manages a class of kernel objects termed host channels, used for streaming data to or from the host environment via an underlying real-time link during the course of target execution. While the HST module requires that each channel's directionality be statically configured as source or sink - meaning, data originates or terminates on the host - developers can dynamically bind these objects to host operating system files and selectively enable channel data flow using a BIOSuite analysis utility. Host channel objects also provide the key mechanism for automatically capturing the contents of designated target I/O streams in real-time. The HST module implements each host channel using a standard DSP/BIOS data pipe object, exposing either the head or tail end of the pipe depending on whether that channel has been configured as a source or sink. To read or write a particular channel, the program locates the corresponding pipe object and calls the functions PIP_get / PIP_free or PIP_alloc / PIP_put. While HST takes over one of the pipe's callback functions for internal purposes, the application can use the other callback to notify the client thread when the channel can be read or written; other pipe attributes (location, number, and length of internal data frames) are assigned values in the course of defining host channel objects using the BIOSuite configuration tool. ![]() Figure 7. DSP/BIOS host channels within a target program. To fully appreciate the power of the HST module, Figure 7 illustrates distinct roles potentially played by DSP/BIOS host channels within a target program at different stages of the application life-cycle. During early development - especially when testing signal processing algorithms - the program would explicitly use source channels to access canned data sets for input to the algorithm, and would use sink channels to record algorithm output for later comparison with expected results. 
Besides enabling errors to be faithfully reproduced during testing, the developer can control program execution on an instruction-by-instruction basis with a standard debugger since the target need not operate in real-time under this first scenario. Once the algorithm appears sound, the developer would replace these host channel objects with I/O drivers for production hardware built around DSP/BIOS pipe objects. This feature proves especially potent once the target system reaches the field, where irreproducible program errors often arise from out-of-spec customer premise equipment that in turn causes glitches in the application data streams. The HST module has been designed to control a variety of real-time communication links between the host and target, ranging from a lower-speed physical JTAG connection already used for debugging purposes to an alternate set of faster serial or parallel interfaces. While the amount of usable bandwidth for real-time capture of target data streams ultimately depends upon the choice of physical data link, the use of HST probe channels is independent of the physical link and will in fact scale nicely in the face of increased bandwidth brought about by changes in the underlying platform.