Multi-Tone Signaling for High-Speed Links

Limits, Design, Modeling, Analysis, and Implementation

 

Amir Amirkhany

Sponsored by Rambus, Inc




Introduction

This project is motivated by the observation that the projected data rates and power efficiency of state-of-the-art baseband (BB) links are not promising [1]. Multi-tone (MT) signaling, on the other hand, has a reputation as an efficient, capacity-achieving transmission method. In addition, MT has the potential to reduce the effective processing rate by parallelizing the data stream in the frequency domain. Fig. 1 shows a conceptual multi-tone system.

 

Figure 1: Conceptual MT system

 

The following steps were taken to verify the potential:

 

  1. Theoretical Limits:
    1. Baseband (BB) Gap: Comparison of Shannon capacity with a baseband system with unconstrained equalizer complexity [1]
    2. Discrete Multi-Tone for links: Hardware requirements and potential data rates for block size constrained DMT [5]
    3. Frequency Division Multiplexing (FDM) for links: Potential data rates with an FDM-type MT architecture with a small number of channels [2]

 

  2. System Architecture [3] (Best Student Paper Award, GlobeCom 2006):
    1. Design: An FDM-based MT architecture with full-band transmit equalization and MIMO receive DFE. Complexity is comparable to BB alternatives, but the architecture is capable of achieving higher data rates.
    2. Analysis: Convex optimization framework for analysis
    3. Modeling: Closed-form modeling of transmitter and receiver jitter
    4. Adaptive solution: A two-step, BER-constrained ZFE solution suitable for adaptation

 

  3. Hardware Architecture and Characterization:
    1. System specification: Universal transmitter supporting 4-channel and 2-channel AMT and a variety of BB modes including 4, 16, and 256 PAM, all at 24Gbps [7]
    2. Characterization: A least-squares characterization method based on Volterra series for wideband circuit characterization [4]

 

  4. Implementation:
    1. Implementation of a 24Gbps, multi-channel digital equalizer [6]

 

These steps are described in more detail below.

Theoretical Limits

Figure 2(a) shows two typical 26” link channels. The NELCO channel has relatively smooth spectral characteristics, while the FR4 channel has a notch at around 4GHz. A DMT system with infinite block size and optimum integer bit-loading, as shown in Figure 2(b), can achieve 38Gbps and 63Gbps over the FR4 and NELCO channels respectively. A BB system with infinite equalizer complexity can achieve the data rates shown in Figure 2(c). The squares on the left side of the figure correspond to the FR4 channel and those on the right correspond to the NELCO channel. For both the MT and BB analyses, the system was cast as a convex feasibility problem, and the optimum bit-loading for the MT system was obtained using the Levin-Campello algorithm. The results show that if unlimited signal processing is available, in a thermal-noise-limited environment, MT can improve link performance by 50%-100%, with more potential over a channel with a notch. [1]


Figure 2: (a) Frequency response of typical link channels (b) Optimum integer bit-loading for infinite block-size DMT (c) achievable BB data rates with infinite equalizer complexity
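As a rough illustration of integer bit-loading, the sketch below runs a greedy loop in the spirit of the Levin-Campello algorithm: each step adds one bit to the tone whose next bit costs the least energy, until the energy budget is exhausted. It is not the code used in [1]; the function name, the gain values, the SNR gap, and the per-tone bit cap are assumptions made for the example, and the convex feasibility formulation is not reproduced.

```python
import heapq


def greedy_bit_loading(gains, energy_budget, gap=1.0, max_bits=8):
    """Rate-adaptive greedy integer bit loading (illustrative sketch).

    gains[n] is an assumed gain-to-noise ratio |H_n|^2 / sigma_n^2 of tone n
    (linear scale). Carrying b bits on tone n is modeled as costing
    gap * (2**b - 1) / gains[n] units of energy.
    """
    bits = [0] * len(gains)
    used = 0.0
    # Heap of (energy needed for the next bit on this tone, tone index).
    heap = [(gap / g, n) for n, g in enumerate(gains) if g > 0]
    heapq.heapify(heap)
    while heap:
        cost, n = heapq.heappop(heap)
        if used + cost > energy_budget:
            break  # the cheapest remaining bit no longer fits the budget
        used += cost
        bits[n] += 1
        if bits[n] < max_bits:
            # Going from b to b+1 bits costs gap * 2**b / gains[n].
            heapq.heappush(heap, (gap * 2 ** bits[n] / gains[n], n))
    return bits, used


# Hypothetical three-tone channel: strong, medium, and weak tone.
bits, used = greedy_bit_loading([8.0, 4.0, 1.0], energy_budget=3.0)
print(bits, used)
```

With these assumed gains, the strong tone carries the most bits and the weak tone is left empty, which is the same qualitative behavior as the bit-loading profile around a channel notch.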

 

Figures 3(a) and 3(b) show the maximum achievable data rate in a block-size-constrained DMT system as a function of utilized channel bandwidth (half the transmitter sample rate) over the FR4 and NELCO channels respectively. Because the block size is limited, for any system sample rate there is an optimum cyclic prefix length, which is a compromise between prefix overhead and inter-block interference. The optimum prefix length is chosen for the plots in Figures 3(a) and 3(b). The results show that, to achieve reasonable performance, the DMT block size should exceed 64 in both cases. In addition, an Analog to Digital Converter with a sampling rate of around 12GHz and at least 6-7 bits of resolution is necessary. These requirements rule out DMT as a power-efficient architecture for high-speed links. The analysis of a block-size-constrained DMT system differs substantially from that of the infinite-block-size DMT in Figure 2, in that inter-block interference must be accurately modeled. Again, the problem was cast as a convex optimization problem, and the optimum bit-loading was obtained through a greedy optimization algorithm based on a simple heuristic. Overall, the results indicate that an MT architecture customized to link characteristics is necessary. [5]

 

        


Figure 3: Maximum DMT data rate (with optimum cyclic prefix length) as a function of utilized channel bandwidth over (a) FR4 (b) NELCO
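The prefix-length compromise can be seen in a deliberately crude toy model, sketched below. It is not the analysis of [5]: the exponentially decaying impulse response, the noise level, and the rule that all channel energy beyond the prefix counts as interference are assumptions chosen only to show that the best prefix length lies strictly between "no prefix" and "prefix covering the whole response".

```python
import math


def dmt_rate_per_sample(h, block_size, prefix_len, noise_power):
    """Toy figure of merit: prefix efficiency times log2(1 + SINR).

    Assumption: taps 0..prefix_len carry useful energy, and every tap
    beyond the prefix contributes pure inter-block interference.
    """
    covered = sum(x * x for x in h[:prefix_len + 1])
    tail = sum(x * x for x in h[prefix_len + 1:])
    sinr = covered / (noise_power + tail)
    efficiency = block_size / (block_size + prefix_len)
    return efficiency * math.log2(1.0 + sinr)


def best_prefix_length(h, block_size, noise_power):
    """Exhaustive search over all prefix lengths up to the channel length."""
    return max(range(len(h)),
               key=lambda L: dmt_rate_per_sample(h, block_size, L, noise_power))


# Hypothetical 32-tap channel with geometric decay, block size 64.
h = [0.7 ** k for k in range(32)]
best = best_prefix_length(h, block_size=64, noise_power=1e-4)
print(best, dmt_rate_per_sample(h, 64, best, 1e-4))
```

In this toy setting a short prefix leaves the system interference-limited, while a long prefix spends too many samples on overhead once the residual interference has dropped below the noise floor.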

 

Figure 4(a) shows a multi-tone system that is a variation of Frequency Division Multiplexing (FDM). Analog low-pass filters and mixers are used to create sub-channels at certain carrier frequencies (hence the name Analog MT, or AMT). This architecture is motivated by the observation that link channels are relatively well-behaved, so only a few sub-channels are needed to come close to the optimum power allocation. Since sharp analog filters cannot be built on-chip, the sub-channels in this system interfere with each other, and Inter-Channel Interference (ICI) exists in addition to Inter-Symbol Interference (ISI). A MIMO linear equalizer at the transmitter and a MIMO DFE at the receiver cancel both ISI and ICI. With 8 sub-channels for the MT system and fair assumptions for jitter and thermal noise for both BB and MT, it was shown that MT has the potential to achieve a 70%-100% higher data rate than BB. Figure 4(b) shows the bit-loading for the MT system and the optimum constellation size and symbol rate for the BB system. [2]


Figure 4: (a) An FDM-type MT system (b) Optimum bit-loading for MT and optimum configuration for BB

System Architecture

Figure 5(a) shows a simplified AMT architecture. The low-pass filters, mixers, and MIMO linear equalizer are all replaced by N-times over-sampled equalizers, one per sub-channel (N is the number of sub-channels). This way, each sub-channel has full control over the entire transmission bandwidth. Consequently, when all sub-channels are optimized together, they can prevent ICI as well as ISI. The receiver low-pass filters are replaced with integrators, and a MIMO DFE in the receiver cancels post-cursor interference. Over an ideal channel, the transmit equalizer would just perform the mixing, and the system reduces to an OFDM system. An important characteristic of the architecture is that the spectral characteristics of the system (sub-channel bandwidth, center frequencies, guard band) can be easily scaled just by changing the input clock to the system.

In the absence of the MIMO DFE, it is possible to view the whole system as a trans-multiplexer system (the dual of a perfect reconstruction system). For a given set of receive filters, the transmitter filters are optimized to create a perfect reconstruction system. In fact, the MIMO linear equalizer in Figure 4(a) together with the transmitter mixers can be viewed as a poly-phase implementation of the N-times over-sampled linear equalizer in Figure 5(a). With the MIMO DFE in the system, the whole system can be viewed as a “controlled reconstruction” system. The important point here is that significant spectral overlap may exist between the sub-channels, as shown in Figure 5(a), as long as interference is zero at the sampling points. Also, integrators are not the only filters that can be employed. Any set of perfect (controlled) reconstruction filters can substitute for the integrators. Figure 5(b) shows an example of a set of four 3rd order passive filters with reasonable characteristics.

 


Figure 5: (a) An AMT system with N-times over-sampled linear transmit equalizers. (b) Frequency response of a set of 3rd order passive filters with “controlled reconstruction” property.
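The ideal-channel case, in which the system reduces to OFDM, can be checked numerically with a minimal sketch. The code below assumes that sub-channel k uses a cosine carrier with exactly k cycles per symbol period and a rectangular symbol window; these choices, the function names, and the symbol values are illustrative assumptions and not the equalizer design of [3]. The rectangular windows give the sub-channels heavily overlapping spectra, yet mixing and integrating over one symbol recovers each symbol with zero interference.

```python
import math


def carrier(k, m, samples_per_symbol):
    """Sample m of the k-th carrier, with k full cycles per symbol period."""
    return math.cos(2.0 * math.pi * k * m / samples_per_symbol)


def transmit(symbols, samples_per_symbol):
    """Sum of rectangular-windowed carriers, one PAM symbol per sub-channel."""
    return [sum(a * carrier(k, m, samples_per_symbol)
                for k, a in enumerate(symbols))
            for m in range(samples_per_symbol)]


def mix_and_integrate(waveform, num_channels):
    """Receiver: mix with each carrier, integrate over one symbol, normalize."""
    M = len(waveform)
    outputs = []
    for j in range(num_channels):
        acc = sum(w * carrier(j, m, M) for m, w in enumerate(waveform)) / M
        # Average carrier power is 1 for DC (j = 0) and 1/2 otherwise.
        outputs.append(acc if j == 0 else 2.0 * acc)
    return outputs


# One 4PAM symbol on each of four sub-channels (hypothetical values).
sent = [3, -1, 1, -3]
received = mix_and_integrate(transmit(sent, 64), len(sent))
print([round(r, 6) for r in received])
```

Over a real link channel this orthogonality is lost, which is the job the jointly optimized transmit equalizers and the MIMO DFE take over.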

Since link systems are designed for very low uncoded bit error rates, it is essential to model the system and all important noise sources very accurately. Reference [3] describes closed-form expressions for the transmitter and receiver jitter variances at the receiver outputs, a Second Order Cone (SOC) formulation of the whole system for finding globally optimum transmitter and receiver tap values, a BER-constrained zero-forcing solution suitable for adaptation, and a performance comparison with BB alternatives.

Figure 6 shows the frequency response of the sub-channels of a 3-channel MT system at the receiver outputs, just before the samplers. The BB sub-channel is 4PAM and the two passband sub-channels are 2PAM. The channel represents a link channel with one via stub. The figure also shows the frequency responses of 4PAM and 2PAM BB systems operating at their maximum signaling rates. Even though significant overlap exists between the MT sub-channels, the equalizer ensures that interference is zero at the sampling point.

Figure 6: Sub-channel frequency responses at receiver output (just before samplers) for a (4PAM, 2PAM, 2PAM) MT system, and for 4PAM and 2PAM BB systems.

Hardware Architecture

Figure 7(a) shows the architecture of the transmitter of an AMT system with a 16-tap, 4x over-sampled equalizer per sub-channel. Assuming that the input constellation per sub-channel is 2PAM or 4PAM and each sub-channel operates at, for example, 3GSym/Sec, the system is capable of transmitting up to 24Gbps of data. Implementing the equalizer in the digital domain makes this architecture very general. Figures 7(b), 7(c), and 7(d) show how the same hardware can be configured to operate as a 4-channel multi-tone system (3GSym/Sec per sub-channel), as a 4PAM BB system (12GSym/Sec), and as a 16PAM BB system (6GSym/Sec). 256PAM at 3GSym/Sec and 2-channel multi-tone at 6GSym/Sec per sub-channel are also possible. All these configurations lead to the same overall data rate of 24Gbps.


Figure 7: (a) Multi-channel, 16-tap per sub-channel transmitter. (b),(c),(d) Transmitter configured as (only 8 taps shown) AMT with 4-sub-channels, 4-way parallelized 2PAM/4PAM BB, and 2-way parallelized 8PAM/16PAM BB.
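The arithmetic behind the 24Gbps figure is simply channels times symbol rate times bits per symbol. The short sketch below tabulates it for the modes whose constellation and symbol rate are stated above; the function and dictionary names are made up for the example.

```python
def data_rate_gbps(num_channels, gsym_per_sec, pam_levels):
    """Aggregate data rate in Gbps for power-of-two PAM constellations."""
    bits_per_symbol = pam_levels.bit_length() - 1  # exact log2 for 2, 4, 16, 256
    return num_channels * gsym_per_sec * bits_per_symbol


# (number of channels, GSym/Sec per channel, PAM levels)
MODES = {
    "4-channel MT, 4PAM per sub-channel": (4, 3, 4),
    "4PAM BB": (1, 12, 4),
    "16PAM BB": (1, 6, 16),
    "256PAM BB": (1, 3, 256),
}

rates = {name: data_rate_gbps(*cfg) for name, cfg in MODES.items()}
for name, rate in rates.items():
    print(f"{name}: {rate} Gbps")
```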

Circuit characterization methods like static INL and DNL do not provide a very useful description of the dynamic behavior of the system. Other dynamic measures like IP2 and IP3 were developed for narrow-band systems, and applying them to wideband circuits is not very meaningful. Metrics like settling time are also defined to characterize a block by itself, and not as part of a bigger system, possibly with sophisticated signal processing capabilities. For a communication system in particular, the ideal characterization method would describe the effect of non-idealities in the system as a dynamic error variance that can be easily plugged into the system analysis frameworks. In order to characterize the wideband components in the AMT system, a least-squares estimation method was developed based on the Volterra series representation of weakly non-linear systems. It decomposes a hardware component into an impulse response; 2nd, 3rd, or higher order non-linear terms; and a cyclo-stationary offset (due to clock injection or cyclo-stationary supply noise). Figure 8(a) shows the impulse responses corresponding to the 1st, 2nd, and 3rd order non-linearity of an 8-bit DAC. Figure 8(b) shows the spectral response of the DAC compared to an ideal zero-order hold DAC. Figure 8(c) shows the estimated cyclo-stationary offset imposed on a differential clock. [4]

This method can be extended to characterizing the wideband dynamic behavior of an entire transmitter, for example. In addition, the abstracted non-linear terms can help build effective non-linear equalizers that can cancel analog non-idealities in the analog domain.


Figure 8: (a) Estimated impulse response, and 2nd and 3rd order nonlinearity response of the DAC. (b) Estimated frequency response of the DAC. (c) Estimated cyclo-stationary offset.
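A minimal sketch of this kind of least-squares fit is given below. It is a simplification and not the method of [4]: it keeps only the diagonal Volterra terms (powers of delayed input samples, sometimes called a memory polynomial), models the cyclo-stationary offset as one constant per clock phase, and solves the normal equations with a small hand-written Gauss-Jordan routine. The "device" is synthetic, and all coefficient values, lengths, and names are assumptions for illustration.

```python
import random


def build_row(x, n, memory, order, period):
    """Regressors for output sample n: x[n-k]**p terms plus per-phase offsets."""
    row = [x[n - k] ** p for p in range(1, order + 1) for k in range(memory)]
    row += [1.0 if n % period == phase else 0.0 for phase in range(period)]
    return row


def solve_least_squares(rows, y):
    """Solve the normal equations (A^T A) c = A^T y by Gauss-Jordan elimination."""
    m = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(m)]
           + [sum(r[i] * t for r, t in zip(rows, y))]
           for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


# Synthetic, noise-free "device" with hypothetical coefficients.
true_coeffs = [1.0, 0.3,       # 1st order response, delays 0 and 1
               0.05, 0.0,      # 2nd order terms
               -0.02, 0.01,    # 3rd order terms
               0.004, -0.004]  # offset on even and odd clock phases
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(300)]
rows = [build_row(x, n, memory=2, order=3, period=2) for n in range(1, len(x))]
y = [sum(c * v for c, v in zip(true_coeffs, row)) for row in rows]

estimated = solve_least_squares(rows, y)
print([round(c, 6) for c in estimated])
```

Because the model is linear in its coefficients, the non-linear terms and the periodic offset are estimated in the same least-squares step as the impulse response.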

Implementation

The digital equalizer in the transmitter of the AMT system performs 16 2-bit by 10-bit multiplications and 15 10-bit additions in one 12GHz cycle and is clocked by a 1.5GHz clock. Multiplications are performed using 4:1 multiplexers, and additions are performed in 3 stages of 4:2 compression. Figure 9(a) shows the functional diagram and pipelining of the equalizer. The equalizer's data path is clocked at 1.5GHz and requires the full attention of a custom design. A flow was developed using commercial ASIC design tools, aided by a hierarchical Matlab placement tool, to exploit the extensive verification and automation capabilities of CAD tools while achieving the precision of full custom design (Figure 9(b)). In this flow, both critical and non-critical parts of the equalizer are handled together. The non-critical parts, including low-speed clock distribution, are handled automatically by the tool. For the critical part, the CAD tools mainly function as script interpreters. Placement scripts are generated by hierarchical placement code written in Matlab. Figure 9(c) shows the different steps of hierarchical placement in Matlab. Functional verification is performed at the Verilog level, and comprehensive timing analysis of the entire design is performed using Static Timing Analysis engines. [6]


Figure 9: (a) Functional block diagram and pipelining of one phase of the digital equalizer. (b) Automated flow for designing the equalizer. (c) Different stages of hierarchical placement Matlab code.
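A behavioral sketch of this arithmetic follows. It assumes that each 2-bit symbol code selects one of four precomputed multiples of the tap coefficient (the role of a 4:1 multiplexer), that the codes map to the 4PAM levels -3, -1, +1, +3, and that each 4:2 compressor is built from two carry-save adders with a final carry-propagate add at the end. These details and all names are assumptions for illustration; the actual circuit is described in [6].

```python
PAM4_LEVELS = (-3, -1, 1, 3)  # assumed mapping from 2-bit code to level


def mux_multiply(code, coeff):
    """2-bit by coefficient 'multiplication' as a 4:1 selection."""
    products = tuple(level * coeff for level in PAM4_LEVELS)  # fixed per tap
    return products[code]


def carry_save_add(a, b, c):
    """3:2 compressor: returns (sum word, carry word) with the same total."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def compress_4_to_2(a, b, c, d):
    """4:2 compressor modeled as two cascaded carry-save adders."""
    s, carry = carry_save_add(a, b, c)
    return carry_save_add(s, carry, d)


def equalizer_output(codes, coeffs):
    """Sum 16 selected products through a tree of 4:2 compressors."""
    operands = [mux_multiply(code, w) for code, w in zip(codes, coeffs)]
    stages = 0
    while len(operands) > 2:
        operands = [v for i in range(0, len(operands), 4)
                    for v in compress_4_to_2(*operands[i:i + 4])]
        stages += 1
    return operands[0] + operands[1], stages


# Hypothetical 16-tap coefficient set and symbol codes.
coeffs = [5, -12, 40, 200, -90, 33, -7, 2, 1, -1, 0, 3, -4, 6, -2, 1]
codes = [0, 3, 1, 2, 2, 1, 3, 0, 0, 1, 2, 3, 3, 2, 1, 0]
value, stages = equalizer_output(codes, coeffs)
print(value, stages)
```

Sixteen operands shrink to 8, then 4, then 2, which is where the three stages of 4:2 compression come from.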

Figure 10 shows the layout of the transmitter and its micrograph.

             


Figure 10: (a) Transmitter layout (b) Chip Micrograph.

 

Fig. 11 shows measured un-equalized 2PAM and equalized (only transmit equalization) 2PAM and 4PAM eye diagrams on an equivalent-time scope at 12GSym/Sec [7].

 

 

Figure 11: Un-equalized 2PAM, equalized 2PAM, and equalized 4PAM, all at 12GSym/Sec.

 

Fig. 12 shows AMT eye diagrams sampled at the end of the same cables and post-processed (mixing and integration) in Matlab. Only transmit equalization is employed [7].

 

Figure 12: AMT eye diagrams post processed in Matlab at 18Gbps (3GSym/Sec per channel).

Awards

Best Student Paper Award, GlobeCom 2006 (Analog Multi-Tone Signaling for High-Speed Backplane Electrical Links)

Publications

[1] V. Stojanović, A. Amirkhany and M.A. Horowitz, “Optimal linear precoding with theoretical and practical data rates in high-speed serial-Link backplane communication,” IEEE International Conference on Communications, June 2004.

[2] A. Amirkhany, V. Stojanović and M.A. Horowitz, “Multi-tone Signaling for High-speed Backplane Electrical Links,” GlobeCom, Nov 2004.

[3] A. Amirkhany, A. Abbasfar, V. Stojanović and M.A. Horowitz, “Analog Multi-Tone Signaling for High-Speed Backplane Electrical Links,” GlobeCom, Nov 2006.

[4] J. Savoj, A. Abbasfar, A. Amirkhany, and M.A. Horowitz, “A New Technique for Characterization of Data Converters in High-Speed Systems,” Design Automation and Test in Europe, April 2007.

[5] A. Amirkhany, A. Abbasfar, V. Stojanović and M.A. Horowitz, “Practical Limits of Multi-Tone Signaling over High-Speed Backplane Electrical Links,” IEEE International Conference on Communications, June 2007.

[6] A. Amirkhany, M. Jeeradit, A. Abbasfar, J. Savoj, B. Garlepp, V. Stojanović, and M.A. Horowitz, “Automated Design of a 3GHz, 24Gbps Digital Equalizer,” Submitted to Design Automation Conference, June 2007.

[7] A. Amirkhany, A. Abbasfar, J. Savoj, M. Jeeradit, B. Garlepp, V. Stojanović, and M.A. Horowitz, “A 24Gb/s Software Programmable Multi-Channel Transmitter,” VLSI Symposium, June 2007. (Available on request – cannot post due to VLSI’s policy)

[8] J. Savoj, A. Abbasfar, A. Amirkhany, M. Jeeradit, and B. Garlepp, “A 12GS/S Phase-Calibrated CMOS Digital-to-Analog Converter,” VLSI Symposium, June 2007.

Talks

  1. A. Amirkhany, V. Stojanovic and M. Horowitz. Analog Multi-Tone Signaling for High-Speed Backplane Electrical Links. GlobeCom, November 2006.

This talk received the Best Student Paper Award at the Global Communications Conference (GlobeCom) in 2006. It describes the architecture of a practical multi-tone system for high-speed backplane links. In addition, it describes closed-form expressions for the transmitter and receiver jitter variances at the receiver outputs, a Second Order Cone (SOC) formulation of the whole system for finding globally optimum transmitter and receiver tap values, a BER-constrained zero-forcing solution suitable for adaptation, and a performance comparison with BB alternatives. To view the transcript, please save and open in PowerPoint.

  2. A. Amirkhany, A. Abbasfar, V. Stojanovic, and M. Horowitz. Multi-tone Signaling for High-speed Backplane Electrical Links. GlobeCom, November 2004.

This talk describes our first attempt at a multi-tone system suitable for backplane links. It is based on an FDM-type architecture with mixers and analog filters. The tradeoffs related to the choice of analog filters are studied, the system is modeled in a convex framework, and its performance is compared with BB. To view the transcript, please save and open in PowerPoint.