# A 2.7µW 10b 640kS/s Time-Based A/D Converter for Implantable Neural Recording Interface

Amir Zjajo, Santosh Astigimath, Rene van Leuken

Circuits and Systems Group, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands

Abstract—In this paper, we propose a time-based, programmable-gain A/D converter that enables an easily scalable, power-efficient, implantable biomedical recording system. The converter circuit is realized in a 90 nm CMOS technology, operates at 640 kS/s, occupies an area of 0.022 mm<sup>2</sup>, and consumes less than 2.7 $\mu$W, corresponding to a figure of merit of 6.2 fJ/conversion-step.

*Keywords*—A/D converter, neural recording interface, time-based signal processing.

# I. INTRODUCTION

Multi-channel, implantable neural recording systems [1]-[3] enable interaction with neural cells to facilitate early diagnosis and to predict intended behavior before any preventive or corrective action is undertaken. However, the space available to host these systems is restricted to ensure minimal tissue damage and tissue displacement during implantation. Furthermore, the power density of the entire system (including the analog front-end, signal sorting, wireless telemetry, energy harvesting, etc.) is limited to 800 $\mu$W/mm<sup>2</sup> [4] to prevent possible heat damage to the tissue surrounding the device (in addition, limited power consumption prolongs the battery's longevity and avoids recurrent battery replacement surgeries). More specifically, the analog front-end has a severe power consumption requirement (i.e., 5-10 $\mu$W/channel), as it is mounted on the electrodes and has a short heat path to the neural tissue under recording. In addition, high-performance neural prosthetic devices require high-density recording at the raw data rate. A 128-channel, 10-bit-precise digitization of neural waveforms sampled at 40 kHz generates $\sim 51 \text{ Mbs}^{-1}$ of data; the power costs of signal conditioning, quantization and wireless communication all scale with the data rate.
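The raw data rate quoted above follows directly from the channel count, resolution and sampling rate. A minimal sketch of that arithmetic, using only the numbers stated in the text (the variable names are illustrative):

```python
# Raw data rate of the multi-channel front-end, using the values from the text.
channels = 128            # number of recording channels
bits_per_sample = 10      # ADC resolution in bits
sample_rate_hz = 40_000   # per-channel sampling rate

raw_rate_bps = channels * bits_per_sample * sample_rate_hz
print(f"raw data rate: {raw_rate_bps / 1e6:.1f} Mb/s")  # prints "raw data rate: 51.2 Mb/s"
```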

In this paper, we propose a time-based A/D converter (ADC) that allows for power- and area-efficient digitization of neural waveforms. Time-mode converters based on asynchronous ADCs [5], slope and integrating ADCs [6], or pulse-position modulation [7] offer high resource efficiency in terms of power and area. In the time-based methodology, conventional voltage and current variables are replaced by corresponding time differences between two rising edges, and logic circuits substitute for the large, power-hungry analog blocks. In deep-submicron CMOS devices, time resolution improves even as the supply voltage is reduced, owing to the decrease in gate delay [8].



Figure 1: An N-channel front-end neural recording interface.

In the proposed design, a voltage signal is converted to a time-domain representation using a comparator-based switched-capacitor circuit [9] and a continuous-time comparator. To improve the power efficiency, the resulting time-domain information is converted to the corresponding digital code with a two-step time-to-digital converter (TDC), where fine quantization of the resulting residue is obtained with a folding Vernier converter. The implementation results in a 90 nm CMOS technology show that a significant gain in throughput, resource usage and power consumption (less than 2.7 $\mu$W, corresponding to a figure of merit of 6.2 fJ/conversion-step) can be obtained for large-scale neural spike data with a simple and compact ADC structure that has minimal analog complexity.

# II. CIRCUIT DESIGN

## A. Architectural Overview of a Multichannel Neural Interface

The data acquired by the recording electrodes in an N-channel neural recording interface are conditioned using analog circuits, as illustrated in Figure 1. The electrode is characterized by its charge density and impedance characteristics (e.g., a 36 µm diameter probe (1000 $\mu$m<sup>2</sup>) may have a capacitance of 200 pF, equivalent to 40 k$\Omega$ impedance at 20 kHz), which determine the amount of noise added to the signal (i.e., 3.5 $\mu$Vrms at 37°C). Because of the small amplitude of neural signals (typically ranging from 10 µV to 500 µV and containing data up to $\sim 20$ kHz) and the high impedance of the electrode-tissue interface, low-noise amplification (LNA), band-pass filtering, and programmable gain amplification (PGA) of the neural signals are performed before the signals can be digitized by the ADC. To relax the in-channel integration density, part of the front-end electronics, usually the ADC, is moved out of the channel to the periphery of the recording area (e.g., with a sampling rate of 40 kS/s per channel, chosen to avoid extensive interpolation of spike samples, a 640 kS/s ADC can be shared among 16 channels). Consequently, a 128-channel front-end interface can be built in a 16×8 configuration.
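The electrode impedance and the ADC sharing figures in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes a purely capacitive electrode model, $|Z| = 1/(2\pi f C)$, which is a simplification introduced here for illustration; the variable names are likewise illustrative.

```python
import math

# Assumption: the electrode is modelled as an ideal capacitor, |Z| = 1 / (2*pi*f*C).
capacitance_f = 200e-12   # 200 pF, from the text
frequency_hz = 20e3       # 20 kHz, from the text
impedance_ohm = 1.0 / (2.0 * math.pi * frequency_hz * capacitance_f)

# ADC sharing: one converter is time-multiplexed across a group of channels.
per_channel_rate = 40_000   # S/s per channel, from the text
channels_per_adc = 16
total_channels = 128
adc_rate = per_channel_rate * channels_per_adc   # required ADC sampling rate in S/s
adc_count = total_channels // channels_per_adc   # number of shared ADCs

print(f"|Z| = {impedance_ohm / 1e3:.1f} kOhm, ADC rate = {adc_rate} S/s, ADCs = {adc_count}")
```

Under these assumptions the impedance comes out close to the 40 k$\Omega$ quoted above, and 16 channels at 40 kS/s give the 640 kS/s converter rate, with eight such converters covering 128 channels.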

This research was supported in part by the European Union and the Dutch government, as part of the CATRENE program under Heterogeneous INCEPTION project.



Figure 2: a) Block diagram of an ADC with two-step time-to-digital conversion; single input version shown for clarity, b) the output voltage ramps to the final value in comparator-based switched-capacitor charge transfer phase, c) ADC timing signals.

The ADC output containing the time-multiplexed neural signals is further processed in a back-end DSP unit, which executes a spike sorting algorithm (to obtain data reduction and distinguish different neuronal sources). The relevant information is then transmitted to an outside receiver through the transmitter, or used for stimulation in a closed-loop framework.

## B. An ADC with Two-Step Time-to-Digital Conversion

The basic concept of the architecture, which utilizes a linear voltage-to-time converter (VTC) and a two-step time-to-digital converter, is illustrated in Figure 2a). The scheme is reconfigurable in terms of input gain (through programmable capacitance $C_2$), resolution (controlling the number of performed iterations) and sampling frequency (through the frequency of the input clock). Once a configuration has been selected, the bias current is also dynamically controlled during the conversion operation to adapt to the reference voltage.

A comparator-based, switched-capacitor gain stage [9] eliminates the high-gain, high-speed operational amplifier from the design and does not require a stabilizing high-gain, high-speed feedback loop, which reduces complexity and relaxes the associated stability versus bandwidth/power tradeoff. The VTC converts a sampled input voltage to a pulse whose duration is linearly proportional to the input voltage. During the charge transfer phase, the current source $I_{XI}$ turns on, charges the capacitor network consisting of $C_1$ and $C_2$, and generates a constant-slope ramp on the output voltage $V_O$, which, via the capacitor divider, causes the virtual ground voltage $V_X$ to ramp simultaneously (Figure 2b). The voltages continue to ramp until the comparator detects the virtual ground condition ($V_X = V_{CM}$) and turns off the current source. When the voltage at the sampling capacitor reaches the comparator threshold, the comparator output goes high.
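The linear voltage-to-time mapping can be illustrated with an idealized ramp model. This is only a behavioral sketch, not the comparator-based switched-capacitor circuit itself: it assumes a perfectly constant ramp slope and assumes that the 0.5 V input range maps onto the 400 ns TDC input range reported in Section III. The function name and parameters are hypothetical.

```python
def vtc_pulse_width_ns(v_in, v_full_scale=0.5, t_full_scale_ns=400.0):
    """Ideal linear VTC: a constant-slope ramp crosses v_in after v_in / slope.

    Assumption: full-scale input (0.5 V) corresponds to the full TDC
    input range (400 ns); non-idealities such as ramp-rate variation are ignored.
    """
    if not 0.0 <= v_in <= v_full_scale:
        raise ValueError("input outside the modelled range")
    slope_v_per_ns = v_full_scale / t_full_scale_ns   # constant ramp rate
    return v_in / slope_v_per_ns


# A mid-scale input produces a pulse of about half the full-scale duration.
half_scale_ns = vtc_pulse_width_ns(0.25)
```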

The circuit realization of a fully-differential comparator with digitally-programmable offset adjustment [11] is illustrated in Figure 3a). Transistors $T_{5-8}$ employ iterated instance notation to designate 5 transistors placed in parallel. The widths of these devices are binary weighted to offer a programmable current gain, which creates an offset-programmable pre-amplifier that is employed for offset compensation. The continuous-time comparator at the output of the voltage-to-time converter consists of a differential amplifier followed by a common-source stage (Figure 3b). The input transistors operate in the subthreshold region for reduced power consumption and to offer a larger input common-mode range and, consequently, an increased ramp dynamic range. The coarse current source (Figure 2a) is a pMOS cascode that is controlled by a switch at the gate of the cascode transistor, and the fine current source is a single nMOS device with a series switch.

The time-to-digital converter measures the time interval $t_m$ from the start of the ramp until the crossover point of the ramp and the input signal, i.e., between the rising edge of the *start* signal and the comparator-generated *stop* signal, as illustrated in Figure 2c). The TDC then generates a corresponding digital output. The simplest TDC realization, a digital counter, requires a (very) high counter frequency to realize a high-resolution converter. Similarly, delay-line circuits, although more power efficient, necessitate a large number of stages to measure the required periods of time, significantly degrading INL and effective resolution [10]. A TDC combining a low-frequency, low-power counter as a coarse quantizer with a folding Vernier delay-line TDC as a fine quantizer offers both a large dynamic range and power efficiency.

## C. Folding Vernier Time-to-Digital Conversion

A coarse time quantizer, designed using a counter (Figure 2a), measures the number of reference clock cycles. The fine-resolution quantization of the two-step time-to-digital converter is performed by a folding Vernier delay TDC. The proposed architecture executes time-to-digital conversion by counting transitions between the stop signal and the next reference clock rising edge after the stop signal (Figure 2a and Figure 2c). These transitions are enabled only during the measurement interval.
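The way the coarse and fine measurements combine can be sketched with a small behavioral model. Several things here are assumptions for illustration only: the *start* edge is taken to coincide with a reference clock edge, the reference clock is the 80 MHz clock reported in Section III, and the fine quantizer is given an idealized step of one reference period divided by 32 so that 5 coarse and 5 fine bits form a 10-bit code. That idealized step is not the delay-element resolution of the actual circuit. Times are kept as integer femtoseconds to avoid floating-point rounding.

```python
T_REF_FS = 12_500_000                  # 80 MHz reference period in femtoseconds
FINE_STEPS = 32                        # assumed: 5 fine bits per reference period
FINE_LSB_FS = T_REF_FS // FINE_STEPS   # 390_625 fs, idealized fine step


def two_step_tdc(t_m_fs):
    """Return (coarse, fine, code) for a start-to-stop interval t_m_fs (integer fs)."""
    # Coarse: reference clock cycles counted up to the first rising edge
    # at or after the stop event (ceiling division on integers).
    coarse = -(-t_m_fs // T_REF_FS)
    # Fine: residue between the stop event and that next clock edge,
    # expressed in idealized fine steps.
    residue_fs = coarse * T_REF_FS - t_m_fs
    fine = residue_fs // FINE_LSB_FS
    # The residue is measured "backwards" from the clock edge, so it is subtracted.
    code = coarse * FINE_STEPS - fine
    return coarse, fine, code


print(two_step_tdc(105_000_000))   # 105 ns interval -> (9, 19, 269)
```

In this model a 105 ns interval spans nine reference edges, the 7.5 ns residue corresponds to 19 fine steps, and the combined code is 9·32 − 19 = 269.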



Figure 3: a) Differential comparator with digitally programmable offset adjustment, b) continuous-time comparator.



Figure 4: Block level of a folding Vernier delay time-to-digital converter.



Figure 5: Simplified overview of a freeze Vernier delay line architecture.



Figure 6: Thermal bit to clock generator.



Figure 7: a) Pulse generator, b) the enable signal generation logic.

The synchronizer block, which consists of three flip-flops in series, ensures that the coarse and fine time measurements are correctly aligned.

A folding Vernier delay TDC is easily scalable to different time resolutions and a higher number of bits without increasing the area. The architecture achieves the minimum time resolution of a Vernier delay element (i.e., the basic inverter delay) and, due to the folding, offers an area-efficient solution. Instead of the 32-element delay line required for the regular Vernier architecture, the folding feature allows the same Vernier delay stages to be used repeatedly to measure the delay. Additionally, with the implemented dynamic control, we sequentially reduce the power required for each conversion.

The block-level diagram of a folding TDC is illustrated in Figure 4, and a simplified overview of the freeze Vernier delay line architecture is shown in Figure 5. In our design, only four thermal codes are generated in every cycle; hence, in the worst case, the measurement cycle is repeated eight times, which is equivalent to a 32-bit thermal code obtained with only four Vernier delay elements. The 4-bit thermal codes are converted into four pulses by the thermal-to-clock generator (Figure 6), and these pulses clock a 5-bit counter at the output of the TDC. For each thermal bit generated in the freeze Vernier delay line, a corresponding pulse is generated using the pulse generator (Figure 7a). The distance between two pulses is controlled with current-starved inverters. The schematic of the enable generation logic is illustrated in Figure 7b). For a rising-edge input, the circuit generates a pulse whose width is determined by the NAND gate, the inverter and the buffer. The *enable* signal, which decides whether the signals $v_{start}/v_{stop}$ or $v_{1\_start}/v_{1\_stop}$ continue into the next cycle, is generated using signals $v_{t4}$ and $v_{p4}$. In the first conversion cycle, *enable*=0 and $v_{start}/v_{stop}$ is selected for measurement; otherwise $v_{1\_start}/v_{1\_stop}$ is selected. The *enable* signal is switched from 1 to 0 when the rising edge of $v_{start}/v_{1\_start}$ crosses the rising edge of $v_{stop}/v_{1\_stop}$. This feature dynamically decides when the conversion is stopped; hence, the power per conversion is optimized based on the input. The TDC also provides feedback to the system with a *ready* signal (the inverted *enable* signal), indicating that it is ready for the next conversion.
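The folding operation can be illustrated with a simple behavioral model in which four Vernier stages are reused for up to eight passes. This is a sketch under stated assumptions rather than a description of the circuit: each stage is assumed to shrink the separation between the two edges by a fixed step `delta`, a thermal bit is counted while the leading edge is still ahead, and the `keep_folding` flag only mimics the role of the *enable* logic without reproducing its polarity or timing. All names are hypothetical.

```python
def folding_vernier(t_diff, delta, stages=4, max_passes=8):
    """Count how many Vernier steps of size `delta` fit into `t_diff`.

    The same `stages` delay elements are reused on every pass (folding)
    until the edges cross or `max_passes` is reached.
    Returns (count, passes_used).
    """
    count = 0
    passes = 0
    remaining = t_diff
    keep_folding = True
    while keep_folding and passes < max_passes:
        passes += 1
        bits = 0
        for _ in range(stages):
            if remaining >= delta:      # leading edge still ahead after this stage
                remaining -= delta
                bits += 1
            else:
                break                   # edges have crossed inside this pass
        count += bits                   # each thermal bit becomes one counter pulse
        keep_folding = bits == stages   # all bits set: feed the residue into another pass
    return count, passes


print(folding_vernier(10, 1))   # (10, 3): two full passes plus a partial third
print(folding_vernier(31, 1))   # (31, 8): worst case uses all eight passes
```

The second return value shows why the conversion energy depends on the input in this model: a short interval terminates after one pass, while a long one needs all eight.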

The 4-bit thermal code is generated with the freeze Vernier architecture [12]. In the conventional Vernier architecture, time capture elements or early-late detectors (e.g., a D-register or an arbiter) impose a large load on the circuit. In the freeze Vernier TDC, the time capturing is instead performed by freezing the node voltages of the start line in a linear Vernier delay line, allowing a power- and area-efficient conversion. The freeze Vernier converter consists of inverters and current-enabled inverters only. Additionally, the circuit does not require a reset signal; it resets on the falling edge of the stop and the start signals. The delays of the inverters in the freeze Vernier delay elements are controlled using the bias current, thus controlling the resolution of the TDC.

# III. SIMULATION RESULTS

Design simulations on the transistor level were performed at body temperature (37 °C) in Cadence Virtuoso using a hardware-calibrated TSMC industrial 90 nm CMOS technology. The analog circuits operate from a 1 V supply, while the digital blocks operate at near-threshold from a 0.4 V supply. The spectral signature of the ADC is illustrated in Figure 6a). The circuit offers a programmable amplification of 0-18 dB by digitally scaling the voltage-to-time converter. SNDR, SFDR and THD versus sampling and input frequency are illustrated in Figure 6b) and 6c), respectively. In the range of 40-640 kS/s, the THD is above 63 dB within the bandwidth of neural activity of up to 20 kHz; the SNDR is above 58 dB, and the SFDR is more than 64 dB. The maximum simulated DNL is 0.6 LSB and the maximum simulated INL is 0.8 LSB. The variation across the slow-slow and fast-fast corners is $\pm 0.35$ ENOB.
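As a consistency check between the quoted SNDR and the effective resolution reported for this design, the standard relation for an ideal quantizer driven by a full-scale sine wave can be applied. The constants 1.76 dB and 6.02 dB/bit are the textbook values and are not taken from this paper:

$$
\text{ENOB} = \frac{\text{SNDR} - 1.76\ \text{dB}}{6.02\ \text{dB/bit}}, \qquad \text{SNDR} = 58\ \text{dB} \;\Rightarrow\; \text{ENOB} \approx \frac{56.24}{6.02} \approx 9.34\ \text{bits}
$$

Conversely, an ENOB of 9.4 bits corresponds to an SNDR of about $6.02 \times 9.4 + 1.76 \approx 58.3$ dB, which agrees with an SNDR above 58 dB.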



Figure 6: a) Spectral signature of A/D converter, b) SFDR, SNDR and THD vs. sampling frequency with $f_{in}$=20 kHz and gain set to 18 dB, c) SFDR, SNDR and THD vs. input frequency with $f_s$=640 kHz and gain set to 18 dB.

|                   | [1]  | [2]  | [3]  | [5]   | [7]  | [13]*             | [14]* | [15]  | [16]* | [17]*             | [this work]*      | [this work]*      |
|-------------------|------|------|------|-------|------|-------------------|-------|-------|-------|-------------------|-------------------|-------------------|
| Technology [µm]   | 0.18 | 0.18 | 0.18 | 0.12  | 0.09 | 0.18              | 0.09  | 0.18  | 0.35  | 0.09              | 0.09              | 0.09              |
| Type              | SAR  | SAR  | SAR  | Time  | Time | Current           | SAR   | ΣΔ    | SAR   | SAR               | Time              | Time              |
| $V_{DD}$ [V]      | 0.45 | 1    | 1.8  | 1.2   | 1    | 1.2               | 1     | 1.8   | 3.3   | 0.5               | 1                 | 1                 |
| $f_{S}$ [kS/s]    | 200  | 245  | 120  | 1000  | 1000 | 16                | 1000  | 50    | 16    | 1280              | 640               | 40                |
| ENOB              | 8.3  | 8.3  | 9.2  | 10    | 7.9  | 8                 | 9.34  | 10.2  | 8.9   | 9.95              | 9.4               | 9.5               |
| FoM [fJ/conv-st]  | 21   | 109  | 382  | 175   | 188  | 132               | 2.87  | 0.22  | 93    | 2.36              | 6.2               | 21                |
| Power [µW]        | 1.35 | 8.4  | 27   | 180   | 14   | 0.45              | 1.79  | 13    | 3.06  | 3                 | 2.7               | 1.6               |
| Area [mm<sup>2</sup>] | NR | NR | NR   | 0.105 | 0.06 | $0.078^{\dagger}$ | NR    | 0.038 | NR    | $0.048^{\dagger}$ | $0.022^{\dagger}$ | $0.022^{\dagger}$ |

TABLE I- COMPARISON WITH PRIOR ART - \*SIMULATION DATA, NR – NOT REPORTED. <sup>†</sup>- ESTIMATED.

The VTC is >9 bit linear across a 0.5 V input range. Consequently, the ramp rate variation across the input range is limited to 10%, leading to a 400 µV nonlinear voltage variation across the output range. The reference clock frequency is 80 MHz, and, consequently, the counter realizes a 5 bit resolution over the 400 ns TDC input time signal range. The ramp repetition frequency, i.e., the sampling frequency of the proposed ADC, is 640 kHz. The simulated ENOB is 9.4 bits over the entire neural spike input bandwidth. The complete A/D converter consumes 2.7 µW when sampled at 640 kS/s, and 1.6 µW at 40 kS/s. The area of the folding Vernier TDC design sums up to 10.5 $\mu$m<sup>2</sup>, the average resolution is 10.05 ps, it operates at a power supply of 0.4 V, and it consumes 0.6 µW of power at a 640 kS/s sampling rate. Table I summarizes the performance and the comparison with prior art, with the figure of merit calculated according to FoM=$P/(2f_{in} \times 2^{\text{ENOB}})$ [J/conv-st].
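Two of the figures in this paragraph can be reproduced with short calculations: the 5-bit coarse resolution and the 6.2 fJ/conversion-step figure of merit at 640 kS/s. The sketch below assumes a Nyquist-rate input, $f_{in} = f_s/2$, when evaluating the FoM expression; that assumption and the function name are introduced here for illustration.

```python
import math


def adc_fom_fj(power_w, f_s_hz, enob):
    """FoM = P / (2 * f_in * 2**ENOB) in fJ/conversion-step.

    Assumption: Nyquist-rate input, f_in = f_s / 2.
    """
    f_in_hz = f_s_hz / 2.0
    return power_w / (2.0 * f_in_hz * 2.0 ** enob) * 1e15


# Figure of merit at 640 kS/s with the reported power and ENOB.
fom = adc_fom_fj(2.7e-6, 640e3, 9.4)
print(f"FoM = {fom:.1f} fJ/conv-step")   # prints "FoM = 6.2 fJ/conv-step"

# Coarse counter: number of reference clock cycles in the TDC input range.
ref_clock_hz = 80e6
tdc_range_s = 400e-9
coarse_steps = round(tdc_range_s * ref_clock_hz)   # 32 reference cycles
coarse_bits = int(math.log2(coarse_steps))         # 5 bits
```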

# IV. CONCLUSION

In this paper, a low-power, time-based, programmable-gain A/D converter for a multi-channel integrated neural implant front-end, consuming less than 2.7 µW, is presented. The power consumption scales with the input voltage level, making the circuit suitable for low-energy signals. The circuit, realized in a 90 nm CMOS technology, embeds both PGA and ADC functionalities; with 6.2 fJ/conversion-step, it exhibits one of the best FoMs reported and occupies an estimated area of only $0.022 \text{ mm}^2$.

# REFERENCES

- [1] D. Han, et al., "A 0.45 V 100-channel neural-recording IC with sub-µW/channel consumption in 0.18 µm CMOS," IEEE Trans. Biomed. Circ. Syst., vol. 7, no. 6, pp. 735-746, 2013.
- [2] X. Zou, et al., "A 100-channel 1-mW implantable neural recording IC," IEEE Trans. Circ. Syst.-I: Reg. Papers, vol. 60, no. 10, pp. 2584-2596, 2013.
- [3] C.M. Lopez, et al., "An implantable 455-active-electrode 52-channel CMOS neural probe," IEEE J. Solid-State Circ., vol. 49, no. 1, pp. 248-261, 2014.
- [4] S. Kim, R. Normann, R. Harrison, F. Solzbacher, "Preliminary study of the thermal impact of a microelectrode array implanted in the brain," IEEE Int. Conf. Engin.in Med. and Biol. Soc., pp. 2986-2989, 2006.
- [5] E. Allier, et al., "120 nm low power asynchronous ADC," IEEE Int. Symp. Low Pow. Electr. Des., pp. 60-65, 2005.
- [6] M. Park, M.H. Perrott, "A single-slope 80MS/s ADC using two-step time-to-digital conversion," IEEE Int. Symp. Circ. Syst., pp. 1125-1128, 2009.
- [7] S. Naraghi, M. Courcy, M.P. Flynn, "A 9-bit, 14 µW and 0.006 mm<sup>2</sup> pulse position modulation ADC in 90 nm digital CMOS," IEEE J. Solid-State Circ., vol. 45, no. 9, pp. 1870-1880, 2010.
- [8] A.P. Chandrakasan, et al., "Technologies for ultradynamic voltage scaling," Proc. IEEE, vol. 98, no. 2, pp. 191-214, 2010.
- [9] J.K. Fiorenza, et al., "Comparator-based switched-capacitor circuits for scaled CMOS technologies," IEEE J. Solid-State Circ., vol. 41, no. 12, pp. 2658-2668, 2006.
- [10] J.P. Jansson, A. Mantyniemi, J. Kostamovaara, "A CMOS time-to-digital converter with better than 10 ps single-shot precision," IEEE J. Solid-State Circ., vol. 41, no. 6, pp. 1286-1296, 2006.
- [11] L. Brooks, H.-S. Lee, "A 12b, 50 MS/s, fully differential zero-crossing based pipelined ADC," IEEE J. Solid-State Circ., vol. 44, no. 12, pp. 3329-3343, 2009.
- [12] K. Blutman, J. Angevare, A. Zjajo, N. van der Meijs, "A 0.1pJ freeze Vernier time-to-digital converter in 65 nm CMOS," IEEE Int. Symp. Circ. Syst., pp. 85-88, 2014.
- [13] B. Haaheim, T.G. Constandinou, "A sub-1uW, 16kHz current-mode SAR-ADC for single-neuron spike recording," IEEE Int. Symp. Circ. Syst., pp. 2957-2960, 2012.
- [14] T. Rabuske, et al., "A self-calibrated 10-bit 1MSps SAR ADC with reduced-voltage charge-sharing DAC," IEEE Int. Symp. Circ. Syst., pp. 2452-2455, 2013.
- [15] C. Gao, et al., "An ultra-low-power extended counting ADC for large scale sensor arrays," IEEE Int. Symp. Circ. Syst., pp. 81-84, 2014.
- [16] L. Zheng, et al., "An adaptive 16/64 kHz, 9-bit SAR ADC with peak-aligned sampling for neural spike recording," IEEE Int. Symp. Circ. Syst., pp. 2385-2388, 2014.
- [17] Y.-W. Cheng, K.T. Tang, "A 0.5-V 1.28-MS/s 10-bit SAR ADC with switching detect logic," IEEE Int. Symp. Circ. Syst., pp. 293-296, 2015.