To quantify its time/space overhead, consider a simple and somewhat generic application
of the DSP/BIOS kernel within a basic telecommunications system that transforms an 8 kHz
input (voice) stream into a 1 kHz output (data) stream using an 8:1 compression algorithm
operating on 64-point data frames. Figure 8 illustrates the high-level organization of the
target application program around DSP/BIOS kernel objects.

Figure 8. Basic telecommunications example transforming a voice stream into a data stream.

The compression algorithm at the heart of this particular example executes once every 8 ms
within the context of a single DSP/BIOS signal object, triggered when the next full input
frame and empty output frame are ready for processing. This particular implementation
relies upon statically configured pipe callbacks to the kernel function SIG_andn, which
clears individual bits in the signal mailbox representing this pair of triggering
conditions. Once dispatched, the signal handler resets the mailbox to its initial non-zero
value before retrieving descriptors for the next set of frames and then invoking the
algorithm itself. (See the following code sample.)
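The mailbox mechanism described above can be modeled on a host machine. The following Python sketch is not DSP/BIOS code and is not the listing referred to in the text. The class SignalModel, its andn method, and the bit names INPUT_FULL and OUTPUT_EMPTY are hypothetical stand-ins, and the model runs the handler immediately instead of going through a real scheduler.

```python
class SignalModel:
    """Host-side model of a signal object with a bit-mask mailbox (hypothetical)."""

    def __init__(self, initial_mailbox, handler):
        if initial_mailbox == 0:
            raise ValueError("initial mailbox must be non-zero")
        self.initial_mailbox = initial_mailbox
        self.mailbox = initial_mailbox
        self.handler = handler

    def andn(self, mask):
        """Clear the bits in mask; run the handler once the mailbox reaches 0."""
        self.mailbox &= ~mask
        if self.mailbox == 0:
            self.handler(self)


INPUT_FULL = 0x1    # hypothetical bit: a full input frame is ready
OUTPUT_EMPTY = 0x2  # hypothetical bit: an empty output frame is ready

processed = []


def compress_handler(sig):
    # Reset the mailbox first, as the text describes, then do the work.
    sig.mailbox = sig.initial_mailbox
    # Stand-in for retrieving frame descriptors and running the algorithm.
    processed.append("frame")


sig = SignalModel(INPUT_FULL | OUTPUT_EMPTY, compress_handler)

sig.andn(INPUT_FULL)    # input pipe callback: one condition met, nothing runs yet
sig.andn(OUTPUT_EMPTY)  # output pipe callback: mailbox reaches 0, handler runs
print(processed, sig.mailbox)
```

Because each callback clears only its own bit, the order in which the two pipes become ready does not matter in this model: the handler runs once both conditions hold, and the reset re-arms the signal for the next frame.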
At the opposite ends of the input and output pipe from the compression signal lie a
pair of interrupt service routines which manage the underlying hardware peripherals that
ultimately produce and consume the data streams processed by the algorithm.
Notwithstanding differences in the implementation of these routines that reflect the
underlying peripherals - say, whether a hardware FIFO or DMA controller can replace a
less efficient interrupt-per-point approach with a single interrupt on frame
boundaries - these interrupt threads invariably exchange full and empty data frames with
the signal thread through analogous pairings of PIP operations. Since PIP_free and PIP_put
implicitly call back to SIG_andn, which in turn posts the compression signal when its
mailbox reaches 0, this segment of the interrupt routine must be bracketed with the
HWI_enter and HWI_exit macros to ensure that the kernel gains control upon return and
performs the necessary context switch.
(See code sample below.)
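The reason for the bracketing can be illustrated with a toy host-side model of deferred dispatch. This Python sketch is not DSP/BIOS code and is not the listing referred to in the text. The class KernelModel and its methods hwi_enter, hwi_exit, and post are hypothetical names, and the rule that posted signals run only when the outermost bracket closes is a simplifying assumption of the model.

```python
class KernelModel:
    """Toy model: signals posted inside an ISR run when the ISR bracket closes."""

    def __init__(self):
        self.isr_depth = 0   # greater than 0 while inside a bracketed ISR
        self.pending = []    # handlers posted but not yet run
        self.trace = []      # record of what happened, in order

    def hwi_enter(self):
        self.isr_depth += 1

    def hwi_exit(self):
        self.isr_depth -= 1
        # Assumption: only the outermost exit hands control to posted signals.
        if self.isr_depth == 0:
            while self.pending:
                handler = self.pending.pop(0)
                handler(self)

    def post(self, handler):
        # Posting only queues the handler; it does not run it.
        self.pending.append(handler)


def compression_signal(kernel):
    kernel.trace.append("signal: compress")


def input_isr(kernel):
    kernel.hwi_enter()
    kernel.trace.append("isr: frame exchanged")
    kernel.post(compression_signal)   # stand-in for the implicit pipe callback
    kernel.trace.append("isr: returning")
    kernel.hwi_exit()                 # the model dispatches the signal here


kernel = KernelModel()
input_isr(kernel)
print(kernel.trace)
```

In this model, a routine that posted the signal without the closing bracket would leave the handler sitting in the pending list, which mirrors the text's point that the kernel needs to regain control on return from the interrupt.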
Table II quantifies the overall performance of the DSP/BIOS kernel within this sample
application, focusing exclusively on the MIPS and memory overhead incurred through usage
of kernel objects and APIs under various execution scenarios. All MIPS and memory figures
are given for a TMS320C54x DSP, assuming code and data reside in on-chip SARAM. The first
row delineates a reference point for the remainder of the table, and represents the
overhead introduced by the DSP/BIOS kernel objects and APIs depicted in the earlier
program examples. Note that the 936 words of program ROM encompass all kernel functions,
but does not include the application program itself (which could be arbitrarily large).
Likewise, the data RAM figures in the table only include the kernel's internal memory
needs plus those of any pre-configured program objects - signals, pipes, accumulators,
etc. - and do not take into account any space for tables or arrays already required by the
application independent of DSP/BIOS; these figures also do not take into account the size
of the internal message buffer associated with a special system LOG object included in the
baseline, since the size of this buffer is ultimately a configurable parameter.
Table II. TMS320C54x performance numbers.

The processor utilization figure of 0.09 MIPS sums all of the kernel functions directly or
indirectly invoked by the sample application during its basic 8 ms processing cycle, and
includes the total number of instructions required to enqueue/dequeue data frame
descriptors from the pair of program pipe objects as well as to switch context to the
foreground compression signal upon return from the posting interrupt routine. For
consistency, this figure does not reflect the MIPS consumed by the compression algorithm
itself or by any hardware-specific processing required in the interrupt routines - cycles
intrinsic to the application itself and its underlying hardware platform.

The second row introduces real-time clock support through a 1 ms timer interrupt
controlled by the CLK module, used subsequently for statistics accumulation as well as to
serve as an underlying time base for driving PRD_tick to periodically execute another
application-level function (say, at twice the rate of the 8 ms compression signal), as
reflected in the third row of the table. Not surprisingly, introduction of the 4 ms
periodic function requires extra stack space to accommodate the additional level of signal
preemption.

The next two rows quantify the overhead introduced by enabling automatic statistics
accumulation and event logging within the DSP/BIOS kernel, utilizing existing STS
accumulator objects already associated with the compression signal and periodic function
threads as well as a system LOG object pre-allocated as part of the baseline
configuration. The 0.06 MIPS consumed by statistics accumulation results from tallying
each thread's execution latency with internally paired calls to the kernel functions
STS_set and STS_delta using the high-resolution clock value returned by CLK_gethtime. The
0.06 MIPS consumed by event logging similarly results from a trio of internal LOG_event
calls as each of the two threads is triggered, dispatched, and terminated.

The last row of the table summarizes the time/space overhead that ensues from introducing
an HST data channel object used, in this case, to probe the application's input pipe and
stream its contents to a host file. While seemingly large, the 155 words of supplementary
RAM required to implement this capability are dominated by a pair of 64-point frames that
hold the data itself. Similarly, the figure of 0.41 MIPS not only includes an internal
quartet of PIP operations invoked on frame boundaries but also folds in a "worst-case"
scenario in which the real-time host/target link generates an interrupt-per-point at the
underlying 8 kHz rate.

At less than 1 MIPS of total overhead with all modes of automatic instrumentation enabled,
the DSP/BIOS kernel can find a place in even the most resource-constrained of DSP
applications. Besides furnishing standard run-time services for structuring an application
program in the manner illustrated earlier - with baseline kernel overhead of only 0.06
MIPS - the true value proposition demonstrated by this example becomes the small
incremental cost of the instrumentation itself, giving weight to the claim that DSP/BIOS
can serve as essential infrastructure for a broad system test and diagnostic strategy for
field and factory alike.
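To put the quoted MIPS figures in per-frame terms, the short Python calculation below converts each one into an approximate instruction count per 8 ms processing cycle. It uses only numbers stated in the text; the function name instructions_per_frame and the dictionary labels are illustrative, and the assumption that each data point occupies one word of RAM is made here only to size the pair of HST frames.

```python
SAMPLE_RATE_HZ = 8000     # 8 kHz voice input
FRAME_POINTS = 64         # 64-point data frames
COMPRESSION_RATIO = 8     # 8:1 compression

# One frame of input spans 64 / 8000 s, which is the 8 ms processing cycle.
frame_period_s = FRAME_POINTS / SAMPLE_RATE_HZ
output_rate_hz = SAMPLE_RATE_HZ // COMPRESSION_RATIO


def instructions_per_frame(mips):
    """Convert a MIPS figure into instructions executed per frame period."""
    return round(mips * 1_000_000 * frame_period_s)


# MIPS figures as quoted in the text for individual table rows.
quoted_mips = {
    "first row (kernel objects and APIs)": 0.09,
    "statistics accumulation": 0.06,
    "event logging": 0.06,
    "HST host channel": 0.41,
}

per_frame = {name: instructions_per_frame(m) for name, m in quoted_mips.items()}

# Assumption: one word per data point, so two 64-point frames need 128 words.
hst_frame_words = 2 * FRAME_POINTS
hst_other_words = 155 - hst_frame_words

print(f"frame period: {frame_period_s * 1000:.0f} ms, output rate: {output_rate_hz} Hz")
for name, count in per_frame.items():
    print(f"{name}: about {count} instructions per frame")
print(f"HST RAM: {hst_frame_words} words of frames, {hst_other_words} words of other state")
```

Under the one-word-per-point assumption, the two frames account for 128 of the 155 supplementary words, which is consistent with the statement that the frames dominate the HST memory cost.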