# A 10-Mbps 0.8-pJ/bit Referenceless Clock and Data Recovery Circuit for Optically Controlled Neural Interface System

Sunkwon Kim, Jong-Kwan Woo, *Member, IEEE*, Woo-Yeol Shin, Gi-Moon Hong, Hyongmin Lee, Hyunjoong Lee, *Student Member, IEEE*, and Suhwan Kim, *Senior Member, IEEE*

*Abstract***—We propose a low-voltage low-power clock and data recovery (CDR) circuit which incorporates a relaxation-based voltage-controlled oscillator and clock-edge modulation, which eliminates the need for an external reference clock without allowing harmonic locking. This CDR supports input data rates between 200 kbps and 10 Mbps at 0.7 V and operates up to 24 MHz at 1.0 V. The proposed design consumes 8** *μ***W at an input data rate of 10 Mbps and achieves 0.8 pJ/bit of energy per bit even though the circuit is implemented in a 0.18-***μ***m CMOS technology.**

*Index Terms***—Clock and data recovery (CDR), clock-edge modulation (CEM), neural interface, referenceless CDR, relaxation oscillator.**

# I. INTRODUCTION

**A** NEURAL INTERFACE system is a biomedical system that performs signal acquisition, amplification, filtering, quantization, and neural stimulation [1], [2]. It needs to be particularly small and power efficient while supporting data transfer rates in the tens of megabits per second [3], [4]. In such a system, the serial data interface and its associated clock circuit must operate at the highest frequency of any component on the die. This requires considerable power; thus, the heat generated by the circuit must be considered in order to avoid tissue damage. It has been suggested that the power-to-area ratio should not exceed 80 mW/cm<sup>2</sup> [5]. Battery-powered implants have obvious drawbacks, including size and biocompatibility issues. The data are usually transmitted by RF telemetry. However, high data rates are not achievable by RF links due to the potential for interference from and with other devices. In a development designed to address this problem, a system

Manuscript received July 17, 2012; revised August 4, 2012 and September 26, 2012; accepted October 21, 2012. Date of publication January 16, 2013; date of current version March 11, 2013. This work was supported in part by the Industrial Source Technology Development Program of the Ministry of Knowledge Economy of Korea under Grants 10033657 and 10033812, and in part by the Smart IT Convergence System Research Center funded by the Ministry of Education, Science and Technology as Global Frontier Project. This brief was recommended by Associate Editor G. Yuan.

S. Kim, W.-Y. Shin, G.-M. Hong, H. Lee, H. Lee, and S. Kim are with the Department of Electrical Engineering, Seoul National University, Seoul 151-744, Korea (e-mail: suhwan@snu.ac.kr).

J.-K. Woo was with the Department of Electrical Engineering, Seoul National University, Seoul 151-744, Korea. He is now with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109 USA.

Color versions of one or more of the figures in this brief are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSII.2012.2234872



Fig. 1. Block diagram of optically controlled neural interface system.

receives data by free-space optics. For example, [6] includes a low-power CDR circuit, which consumes 217 nW at a 3b4b encoded input data rate of 200 kbps. However, a more flexible CDR that is able to lock over a wide range of data rates (typically, 45–200 kbps) requires an expensive external clock to be provided by a separate optical diode. To make matters worse, the wavelength of this external clock must differ from the wavelength of the optical data to avoid crosstalk.

Fig. 1 is a block diagram of an optically controlled neural interface system consisting of a transimpedance amplifier, a CDR, a controller, a driver, and readout and stimulus circuits. It can read out neural signals and stimulate neurons through a microelectrode array. The data and commands of our system are transmitted optically at a high data rate.

In this brief, we focus on the CDR circuit itself since it is one of the most power-consuming blocks. Our CDR does not need an external reference clock and achieves 0.8 pJ/bit of energy per bit even though the circuit is implemented in a 0.18-$\mu$m CMOS technology.

The rest of this brief is organized as follows. In Section II, we introduce the optically controlled referenceless CDR, the mechanism of our clock-edge modulation (CEM), and the design of each block. Section III provides details of the circuit implementation. In Section IV, we present experimental results obtained from the new CDR, and conclusions are drawn in Section V.

# II. CDR ARCHITECTURE

Research on CDRs for chip-to-chip transfer has mainly been focused on performance issues such as input data rate, jitter, and jitter tolerance [7]. However, a simpler CDR with lower power consumption that supports lower data rates cannot be obtained by scaling high-performance designs. In a wireless biomedical application, whether communication is RF or optical, the design of a synchronous CDR must take into account



Fig. 2. Block diagram of the proposed referenceless CDR.





Fig. 3. Timing diagram of CEM.

crosstalk between the data and the clock. A synchronous CDR is an expensive part of an optical system, because it requires optics for several wavelengths, together with bandpass filters. Thus, a referenceless CDR [8]–[11] is more appropriate for this application.

CDR designs that do not have a reference clock require a complicated frequency detection circuit, which consumes a lot of power and area. An alternative is the clock-embedded CDR with additional voltage levels, which extracts the clock signal using information embedded in the data stream itself. This approach simplifies clock recovery without introducing the possibility of harmonic locking, which suggests that it should be suitable for pad-limited design. However, because this clock-embedded CDR extracts data information by means of additional voltage levels [10], it is unsuitable for our application, in which the dynamic range of the photodiode is limited. Another alternative is a CEM technique, which can extract data information from the clock signal by repositioning the clock edge [11].

We propose a low-power referenceless CDR architecture that achieves CEM with a single phase. As shown in Fig. 2, a clock recovery loop and a sampling flip-flop triggered by the falling edge are used to generate the clock and data. The clock recovery loop is based on a phase-locked loop (PLL) with a phase frequency detector (PFD), which is easily implemented because the input signal is CEM encoded. Fig. 3 shows the timing diagram of the CEM. The positive edges of the input signal contain both period and phase information. Each negative edge of the input signal carries a single bit of data. The PFD extracts clock information only from the positive edges of the input signal and compares the recovered clock with this CEM input signal. The UP/DN signals from the PFD identify the leading and lagging phases of the recovered clock, and these signals are sent to the charge pump (CP). Then, the voltage from the CP changes the period of the relaxation oscillator. Thus, the lead or lag between the input signal and the recovered clock determines the period

Fig. 4. Schematic diagram of phase and frequency detector.

and phase of the oscillator. This allows the recovered clock signal to be locked without placing any limit on the run length of the clock information. In the locked state, accurate data are obtained simply by sampling the input signal at the falling edge of the internal clock. If the positions of a falling edge for zero or one are within 75% and 25% of the period, respectively, then the CDR keeps the jitter of the recovered clock signal below 0.25 UI.
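The sampling rule described above can be illustrated with a small behavioral model. This is a minimal sketch rather than the circuit itself: it assumes an ideal, already locked recovered clock with a 50% duty cycle, a hypothetical resolution of 100 samples per bit period, and the edge placement stated in the text (falling edge at 25% of the period for a one and at 75% for a zero). Under that polarity assumption, the level sampled at mid-period is the complement of the transmitted bit, so the decoder inverts it. The names `cem_encode` and `cem_decode` are hypothetical.

```python
SAMPLES_PER_BIT = 100  # hypothetical time resolution of the model


def cem_encode(bits, samples_per_bit=SAMPLES_PER_BIT):
    """Build a CEM waveform: every bit period starts with a rising edge;
    the falling edge sits at 25% (bit 1) or 75% (bit 0) of the period."""
    wave = []
    for bit in bits:
        fall = samples_per_bit // 4 if bit else 3 * samples_per_bit // 4
        wave.extend([1] * fall + [0] * (samples_per_bit - fall))
    return wave


def cem_decode(wave, samples_per_bit=SAMPLES_PER_BIT, clock_offset=0):
    """Sample the waveform at the falling edge of an ideal 50%-duty clock.

    clock_offset models a static phase error of the recovered clock,
    expressed in samples (assumption: no jitter, no frequency error).
    """
    bits = []
    for k in range(len(wave) // samples_per_bit):
        idx = k * samples_per_bit + samples_per_bit // 2 + clock_offset
        level = wave[idx]
        bits.append(1 - level)  # assumed polarity: high at mid-period means 0
    return bits


if __name__ == "__main__":
    pattern = [1, 0, 0, 1, 0, 1, 0]
    print(cem_decode(cem_encode(pattern)) == pattern)
```

In this idealized model, the decoded bits stay correct as long as the sampling instant remains strictly between the 25% and 75% edge positions, which corresponds to the 0.25-UI margin mentioned in the text.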

# III. CIRCUIT IMPLEMENTATION

### *A. Phase and Frequency Detector and CP*

The PFD compares two input signals in terms of both phase and frequency [12]. The output is a pulse proportional to the phase difference between the inputs, and this output drives the CP to adjust the control voltage of the voltage-controlled oscillator (VCO). The phase characteristic of the PFD is important, as it is linked to jitter in the PLL. If the PFD fails to detect the phase error when it is within the dead zone, which is the undetectable phase difference range, then the PLL will lock to an incorrect phase [13].

We adopt a PFD with a reset pulse which has a reduced dead zone. The reset pulse applied to the PFD reduces the jitter arising from up and down current mismatch in the CP when it is operating at low voltage. The simple feedback keeper technique improves the noise immunity of the PFD. In our design, the keeper transistors also reduce the power requirement of the PFD and increase its speed. Fig. 4 shows our PFD design, in which the dead zone is effectively reduced by the gate and asynchronous delays.

Fig. 5 shows a finite-state diagram of its four possible states, in which each state transition is annotated with the corresponding transition condition, which is basically the rising transition of the REF or CLK signals, denoted by REF$\uparrow$ and CLK$\uparrow$, respectively. Let us assume that UPB and DNB are both high (state $= 11$) initially, while REF and CLK are low. Then, a



Fig. 5. Finite-state diagram of the PFD.



Fig. 6. Schematic diagram of the CP.

rising edge of REF will drive UPB low, and a rising edge of CLK will drive DNB low. When both UPB and DNB are low, the circuit is reset, returning UPB and DNB to high. State 00 is unstable in Fig. 5. In this state, reset will turn on M4 and M12, nodes X1 and Y1 are discharged accordingly, and nodes X2 and Y2 will be charged to high through M5 and M13, respectively. This returns the circuit to state 11, in which the keepers M8 and M16 precharge nodes X2 and Y2, respectively, so as to stabilize dynamic nodes X2 and Y2, particularly when the voltage is low.
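The state transitions described above can be sketched as a small behavioral model. This is an illustration built only from the textual description (active-low outputs UPB and DNB, reset when both are low); it does not model the transistor-level nodes X1, X2, Y1, and Y2 or any gate delay, and the function names are hypothetical.

```python
def pfd_step(state, event):
    """Advance the PFD by one rising edge.

    state is the tuple (UPB, DNB) of active-low outputs;
    event is "REF" or "CLK" (a rising edge on that input).
    Returns (next_state, reset_fired).
    """
    upb, dnb = state
    if event == "REF":
        upb = 0  # a rising edge of REF drives UPB low
    elif event == "CLK":
        dnb = 0  # a rising edge of CLK drives DNB low
    else:
        raise ValueError(f"unknown event: {event!r}")
    if (upb, dnb) == (0, 0):
        # State 00 is unstable: the reset path returns both outputs high.
        return (1, 1), True
    return (upb, dnb), False


def run_pfd(events, state=(1, 1)):
    """Apply a sequence of rising edges and return the visited stable states."""
    visited = [state]
    for event in events:
        state, _ = pfd_step(state, event)
        visited.append(state)
    return visited


if __name__ == "__main__":
    # REF leads CLK: UPB is low between the two edges, then the PFD resets.
    print(run_pfd(["REF", "CLK"]))
```

In this model, the time spent in state 01 or 10 between the two edges corresponds to the width of the pulse that drives the CP.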

Fig. 6 is a schematic diagram of the CP circuit, which has a single-ended source-switched architecture. The voltage $V_{\text{ctrl}}$ is determined by switches M1 and M6, which are controlled by outputs UPB and DNB of the PFD, respectively. Switches M2 and M4 are included to minimize the current mismatch due to charge sharing. To improve the accuracy of current mirroring, dummy switches are used in the bias branches.

### *B. Voltage-Controlled Relaxation Oscillator*

Quartz crystal oscillators are large and consume a lot of power. Therefore, ring or relaxation oscillators are preferable for implanted biomedical devices. Ring oscillators are commonly able to cope with higher frequencies and consume more power than relaxation oscillators [14]. However, relaxation oscillators have two more advantages over ring oscillators: They have a constant frequency tuning gain, and their phase can be read out continuously due to their triangular (or sawtooth) waveform [15]. The period of a relaxation oscillator is determined in a well-defined manner, at a modest cost in power and silicon area. We therefore adopt a relaxation oscillator for our VCO, as shown in Fig. 7. A relaxation oscillator produces a constant frequency by charging and discharging capacitors between two fixed voltages. The VCO in our CDR has a



Fig. 7. Schematic diagram of our voltage-controlled relaxation oscillator.

constant charge voltage and a controllable discharge voltage. The VCO consists of a constant-current source $I_{\text{ref}}$, which is controlled by the recovered clock signal, and two comparators that compare the capacitor voltages $V_{C1}$ and $V_{C2}$ with the control voltage $V_{\text{ctrl}}$. A set–reset (SR) latch receives the output voltages from the two comparators, and the output of the latch provides feedback that turns the constant-current source on and off. As shown in Fig. 7, $C_1$ and $C_2$ are alternately charged to $V_{\text{DD}}$, controlled by the state of the SR latch, and then discharged to $V_{\text{ctrl}}$ by $I_{\text{ref}}$. The triangular waveforms of $V_{C1}$ and $V_{C2}$ form one period of the clock. The length $T_{\text{OSC}}$ of this period is determined as follows:

$$
T_{\rm OSC} = \frac{2C(V_{\rm DD} - V_{\rm ctrl})}{I_{\rm ref}} \tag{1}
$$

where $I_{\text{ref}}$ is a constant current, $V_{\text{ctrl}}$ is the control voltage, and $C = C_1 = C_2$. According to (1), mismatch between the two $I_{\text{ref}}$ current sources or between the two capacitors can affect the duty cycle of the recovered clock. The VCO gain ($K_{\text{VCO}}$) is 83 MHz/V.
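As a numerical illustration of (1), the following sketch evaluates the oscillation period for a set of component values. The values of $C$, $V_{\text{ctrl}}$, and $I_{\text{ref}}$ below are hypothetical, chosen only so that the period comes out near the 100 ns that corresponds to 10 Mbps; they are not taken from the design, and the function name is likewise an assumption.

```python
import math


def relaxation_period(c_farads, v_dd, v_ctrl, i_ref):
    """Period from (1): two capacitors, each discharged from V_DD to V_ctrl by I_ref."""
    if c_farads <= 0 or i_ref <= 0:
        raise ValueError("capacitance and current must be positive")
    if not 0 <= v_ctrl < v_dd:
        raise ValueError("expected 0 <= v_ctrl < v_dd")
    return 2.0 * c_farads * (v_dd - v_ctrl) / i_ref


# Hypothetical component values (not from the paper).
C = 1e-12      # 1 pF per capacitor
V_DD = 0.7     # supply voltage in volts, as in the measurements
V_CTRL = 0.45  # assumed control voltage in volts
I_REF = 5e-6   # assumed discharge current in amperes

if __name__ == "__main__":
    period = relaxation_period(C, V_DD, V_CTRL, I_REF)
    print(f"T_OSC = {period * 1e9:.1f} ns, f = {1.0 / period / 1e6:.2f} MHz")
    assert math.isclose(period, 100e-9, rel_tol=1e-9)
```

According to (1), raising $V_{\text{ctrl}}$ toward $V_{\text{DD}}$ shortens the discharge swing and therefore the period, which is how the CP output can speed up or slow down the recovered clock.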

At the normal operating voltage $V_{\text{DD}}$, symmetry requires that $V_{\text{out}} = V_X$. If $V_{C1}$ or $V_{C2}$ is much more positive than $V_{\text{ctrl}}$, then M4 operates in its deep triode region, carrying zero current. Thus, $V_{\text{out}} = V_{\text{DD}}$. As $V_{C1}$ or $V_{C2}$ approaches $V_{\text{ctrl}}$, M1 turns on, drawing part of $I_{D5}$ from M3 and turning M4 on. The output voltage then depends on the difference between $I_{D4}$ and $I_{D2}$. As $V_{\text{DD}}$ drops, so do $V_X$ and $V_{\text{out}}$, with a slope close to unity. As $V_X$ and $V_{\text{out}}$ fall below $V_{\text{ctrl}} - V_{\text{THN}}$, M1 and M2 enter their triode regions, but their drain currents are constant if M5 is saturated. A further decrease in $V_{\text{DD}}$ and, hence, in $V_X$ and $V_{\text{out}}$ causes $V_{\text{GS1}}$ and $V_{\text{GS2}}$ to increase, eventually driving M5 into the triode region. Thereafter, the bias current of all of the transistors drops, lowering the rate at which $V_{\text{out}}$ decreases to the logic threshold. In our design, the logic threshold of the latch is lowered to operate at a low supply voltage. Thus, the UP/DN signals are determined by the lead or lag of the phase at the output of the PFD. This modifies $V_{\text{ctrl}}$, and the frequency of the relaxation oscillator is adjusted to align the phases, as shown in Fig. 8.

### *C. BER Test Circuit*

To verify the functionality of our CDR and establish its bit error rate (BER), the test circuit shown in Fig. 9, which generates



Fig. 8. Timing diagrams of (a) the lock state and (b) the lag state.



Fig. 9. Schematic diagram of the BER test circuit.



Fig. 10. Die photo.

pseudorandom binary sequences (PRBSs) of length $2^{15} - 1$, was embedded in a design under test. The same known data pattern is sent to the CDR circuit. The feedback polynomial of the PRBS has terms with powers of 14 and 15. The retimed and expected data are compared to generate a bit error signal, and the BER can be calculated from this signal and the running time.
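The test flow described above can be sketched in software. The example below is a minimal model, assuming the common shift-register realization of the polynomial $x^{15} + x^{14} + 1$ (feedback from stages 15 and 14); the seed value, the function names, and the way the BER is estimated from error count, data rate, and running time are illustrative assumptions rather than a description of the on-chip circuit.

```python
def prbs15(seed=0x7FFF):
    """Yield bits of a 2**15 - 1 PRBS from a 15-bit LFSR with taps at stages 15 and 14."""
    if not 0 < seed <= 0x7FFF:
        raise ValueError("seed must be a nonzero 15-bit value")
    state = seed
    while True:
        new_bit = ((state >> 14) ^ (state >> 13)) & 1
        state = ((state << 1) | new_bit) & 0x7FFF
        yield new_bit


def take(generator, count):
    """Collect the next `count` bits from a bit generator."""
    return [next(generator) for _ in range(count)]


def count_bit_errors(expected, received):
    """Compare expected and retimed bits and count the mismatches."""
    return sum(e != r for e, r in zip(expected, received))


def ber_estimate(errors, data_rate_bps, run_time_s):
    """Estimate the BER as errors divided by the number of bits sent."""
    return errors / (data_rate_bps * run_time_s)


if __name__ == "__main__":
    length = 2**15 - 1
    sequence = take(prbs15(), 2 * length)
    print(sequence[:length] == sequence[length:])  # the pattern repeats every 2**15 - 1 bits
```

A zero-error run only bounds the BER from above, so the achievable bound depends on how many bits were observed during the test.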

# IV. EXPERIMENTAL RESULTS

The proposed CDR was implemented using a 0.18-$\mu$m process. Fig. 10 shows the die photograph. The loop filter, which occupies most of the die area, is a second-order RC filter for stability. The core area is $0.09 \text{ mm}^2$, including some test circuitry. Fig. 11(a) is a screenshot of the oscilloscope when the supply voltage is 0.7 V and the input data rate is 10 Mbps. The input pattern is "1001010," which is part of a $2^{15} - 1$ PRBS. The bit error signal indicates that the CDR is not yet ready. As shown in Fig. 11(b), our CDR has a settling time of 25 $\mu$s.

Fig. 12 is a plot of power consumption and a figure-of-merit (FoM) versus data transfer rate. This FoM is the ratio of power consumption to data transfer rate. Our CDR supports input data rates from 200 kbps to 10 Mbps at a supply voltage of 0.7 V, consumes 8 $\mu$W at an input data rate of 10 Mbps, and has an FoM of 0.8 pJ/bit.
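Because the FoM is simply power divided by data rate, it can be checked with a few lines of arithmetic. The sketch below recomputes the energy per bit from the power and data-rate figures reported in Table I; the function name and the dictionary layout are hypothetical.

```python
def energy_per_bit_pj(power_watts, data_rate_bps):
    """FoM as defined in the text: power divided by data rate, in pJ/bit."""
    if data_rate_bps <= 0:
        raise ValueError("data rate must be positive")
    return power_watts / data_rate_bps * 1e12


# (power in W, data rate in bit/s) as reported in Table I.
REPORTED = {
    "VLSI'09 [16]": (21e-6, 250e3),
    "ISSCC'10 [6]": (217e-9, 200e3),
    "This work": (8.05e-6, 10e6),
}

if __name__ == "__main__":
    for name, (power, rate) in REPORTED.items():
        print(f"{name}: {energy_per_bit_pj(power, rate):.3g} pJ/bit")
```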

Since the chip area is $0.09 \text{ mm}^2$, we have a power-to-area ratio of about 8.8 mW/cm<sup>2</sup>, which is almost an order of magnitude less than the critical heat flux threshold for implants



Fig. 11. Measured waveforms of (a) the recovered clock and retimed data and (b) the settling time.



Fig. 12. Measured data transfer rate versus power consumption and FoM.

of 80 mW/cm<sup>2</sup> [5]. When the supply voltage is 1.0 V, normal operation allows data rates from 200 kbps to 24 Mbps. We measured the power consumption using an input pattern which produces the maximum number of transitions. Fig. 13(a) and (b) show histograms of the measured recovered clock period, which is nominally 100 ns, and of its duty cycle, respectively. The standard deviation in recovered clock period is 0.54 ns, and that in duty cycle is 0.31%. During a 32-h test, no error bits were detected, suggesting that the BER is lower than $10^{-13}$. The data rate was 10 Mbps, and the circuit was operating with a supply voltage of 0.7 V and a $2^{15} - 1$ PRBS input pattern. Measured parameters for this CDR are compared with those of previous devices in Table I. Our CDR uses the least energy at 0.8 pJ/bit and achieves the highest data rate, which is 10 Mbps at a supply voltage of 0.7 V.



Fig. 13. Measured histogram of (a) the recovered clock period and (b) duty cycle of recovered clock.

TABLE I COMPARISON WITH OTHER REPORTED CDRS

|              | VLSI'09 [16] | ISSCC'10 [6]         | This work    |
|--------------|--------------|----------------------|--------------|
| Process      | 0.18 $\mu$m  | 90 nm                | 0.18 $\mu$m  |
| Design       | Demodulator  | CDR                  | CDR          |
| Data rate    | 250 kbps     | 200 kbps             | 10 Mbps      |
| $V_{\rm DD}$ | 0.7 V        | 0.3 V                | 0.7 V        |
| Power        | 21 $\mu$W    | 217 nW               | 8.05 $\mu$W  |
| FoM          | 84 pJ/bit    | 1 pJ/bit<sup>*</sup> | 0.8 pJ/bit   |

<sup>*</sup>3b4b coding and reference clock

# V. CONCLUSION

We have designed and experimentally validated an 8-$\mu$W 10-Mbps referenceless CDR circuit with CEM for biomedical systems, which operates at a supply voltage of 0.7 V. This design obviates the need for an external reference clock but is not subject to harmonic locking. A VCO based on a relaxation oscillator is used to reduce power consumption.

Our CDR has an input data rate between 200 kbps and 10 Mbps when the supply voltage is 0.7 V and operates up to 24 MHz with a supply voltage of 1.0 V. The BER of our CDR is lower than $10^{-13}$. The energy per bit is only 0.8 pJ/bit, even though the circuit is implemented in a 0.18-$\mu$m CMOS technology.

#### **REFERENCES**

- [1] H. Yang and R. Sarpeshkar, "A bio-inspired ultra-energy-efficient analog-to-digital converter for biomedical applications," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 53, no. 11, pp. 2349–2356, Nov. 2006.
- [2] A. Gerosa and A. Neviani, "A 1.8-$\mu$W sigma-delta modulator for 8-bit digitization of cardiac signals in implantable pacemakers operating down to 1.8 V," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 52, no. 2, pp. 71–76, Feb. 2005.
- [3] C. T. Charles, "Wireless data links for biomedical implants: Current research and future directions," in *Proc. IEEE BIOCAS Conf.*, 2007, pp. 13–16.
- [4] K. Jones and R. Normann, "An advanced demultiplexing system for physiological stimulation," *IEEE Trans. Biomed. Eng.*, vol. 44, no. 12, pp. 1210–1220, Dec. 1997.
- [5] T. M. Seese, H. Harasaki, G. M. Saidel, and C. R. Davies, "Characterization of tissue morphology, angiogenesis, temperature in the adaptive response of muscle tissue to chronic heating," *Lab. Invest.*, vol. 78, no. 12, pp. 1553–1562, Dec. 1998.
- [6] T. Kleeburg, J. Loo, N. J. Guilar, E. Fong, and R. Amirtharajah, "Ultralow-voltage circuits for sensor applications powered by free-space optics," in *Proc. IEEE ISSCC Tech. Dig. Papers*, 2010, pp. 502–503.
- [7] C. Hsieh and S. Liu, "A 1-16-Gb/s wide-range clock/data recovery circuit with a bidirectional frequency detector," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 58, no. 8, pp. 487–491, Aug. 2011.
- [8] D. Dalton, K. Chai, E. Evans, M. Ferriss, D. Hitchcox, P. Murray, S. Selvanayagam, P. Shepherd, and L. DeVito, "A 12.5 Mb/s to 2.7 Gb/s continuous-rate CDR with automatic frequency acquisition and data-rate readback," in *Proc. IEEE ISSCC Tech. Dig. Papers*, 2005, pp. 230–595.
- [9] R. Yang, K. Chao, S. Hwu, C. Liang, and S. Liu, "A 155.52 Mbps–3.125 Gbps continuous-rate clock and data recovery circuit," *IEEE J. Solid-State Circuits*, vol. 41, no. 6, pp. 1380–1390, Jun. 2006.
- [10] M. Park, Y. Lee, J. Lim, B. Hong, T. Kim, H. Nam, H. Song, D. Jeong, and W. Kim, "An advanced intra-panel interface (AiPi) with clock embedded multi-level point-to-point differential signaling for large-sized TFT-LCD applications," *SID Symp. Tech. Dig. Papers*, vol. 37, no. 1, pp. 1502–1505, Jun. 2006.
- [11] W. Choe, B. Lee, J. Kim, D. Jeong, and G. Kim, "A single-pair serial link for mobile displays with clock edge modulation scheme," *IEEE J. Solid-State Circuits*, vol. 42, no. 9, pp. 2012–2020, Sep. 2007.
- [12] D. M. W. Leenaerts, C. Vaucher, and J. van der Tang, *Circuit Design for RF Transceivers*. Norwell, MA: Kluwer, 2001.
- [13] H. Johansson, "A simple precharged CMOS phase frequency detector," *IEEE J. Solid-State Circuits*, vol. 33, no. 2, pp. 295–299, Feb. 1998.
- [14] Y. Tokunaga, S. Sakiyama, A. Matsumoto, and S. Dosho, "An on-chip CMOS relaxation oscillator with voltage averaging feedback," *IEEE J. Solid-State Circuits*, vol. 45, no. 6, pp. 1150–1158, Jun. 2010.
- [15] P. F. J. Geraedts, E. van Tuijl, E. A. M. Klumperink, G. J. M. Wienk, and B. Nauta, "A 90 $\mu$W 12 MHz relaxation oscillator with a $-162$ dB FOM," in *Proc. IEEE ISSCC Tech. Dig. Papers*, 2008, pp. 348–350.
- [16] J. Bae, N. Cho, and H.-J. Yoo, "A 490 $\mu$W fully MICS compatible FSK transceiver for implantable devices," in *Proc. IEEE Symp. VLSI Circuits*, 2009, pp. 36–37.