# A 128-Channel, 8.9-ps LSB, Column-Parallel Two-Stage TDC Based on Time Difference Amplification for Time-Resolved Imaging

Shingo Mandai and Edoardo Charbon

Abstract—This paper proposes a 128-channel column-parallel two-stage time-to-digital converter (TDC) utilizing a time difference amplifier (TDA) and shows measurement results obtained from an implementation in a 0.35- $\mu$ m CMOS process. The first stage operates as a coarse TDC, the time residue is amplified by a TDA, then converted by the second-stage TDC. As the gain of the time difference amplifier can be adjusted from 8.5 to 20.4, the time resolution of the TDC can be tuned from 21.4 to 8.9 ps. The time resolution variation due to process-voltage-temperature (PVT) effects is  $\pm 5.8\%$  without calibration when the time resolution is 12.9 ps. We propose a calibration method to compensate LSB changes due to the power supply fluctuation and temperature variation.

*Index Terms*—Fluorescence lifetime imaging microscopy (FLIM), positron emission tomography (PET), time-of-flight (TOF), time-to-digital converter (TDC).

## I. INTRODUCTION

TIME-TO-DIGITAL converters (TDCs) are widely used in time-resolved imaging applications, where photon time-of-arrival must be computed with high precision. Examples of such applications are time-of-flight (TOF) 3-D vision, positron emission tomography (PET), and fluorescence lifetime imaging microscopy (FLIM). For these applications, numerous TDCs are often required to increase the frame rate and to acquire time information from as many detectors as possible [1]–[5]. For PET in particular, acquiring multiple timestamps is known to improve the estimate of the scintillation arrival time and thus the timing resolution [6], [7]. A TDC can be implemented in an application-specific integrated circuit (ASIC) or in a field-programmable gate array (FPGA). While FPGAs enable full flexibility and fast development, ASICs are compact, reliable, and generally operate at lower power for comparable performance [8], [9]. In addition, ASICs can be used in conjunction with a photodetector array, thus minimizing parasitics. In ASIC-TDCs, the time resolution of a basic TDC consisting of a delay chain and D flip-flops (D-FFs) is determined by the unit delay element, which is often an inverter [10]. Of course, a small inverter delay could be achieved with advanced

Manuscript received April 28, 2012; revised July 02, 2012; accepted July 06, 2012. Date of publication August 20, 2012; date of current version October 09, 2012.

The authors are with the Technology University of Delft, Delft 2628 CD, The Netherlands (e-mail: s.mandai@tudelft.nl; e.charbon@tudelft.nl).

Digital Object Identifier 10.1109/TNS.2012.2208761

CMOS processes with 90- or 65-nm feature size, but the price to pay is generally lower quantum efficiency and a narrower sensitivity spectrum of the photodetector [11]. Some TDCs achieve subgate delay resolution by employing an interpolated delay chain or ring oscillators [12], [13], Vernier delay lines [14]–[16], pulse shrinking [17], interpolation [18], or time difference amplification [19], [20]. TDCs employing noise shaping [21], [22] have also been reported; however, they occupy a large area. Another approach uses column-level or pixel-parallel TDCs to raise the frame rate [1], [3]–[5], [23], employing a very simple structure of a delay chain, D-FFs, and counters, but at the expense of time resolution. Subgate resolution is generally not used in these approaches in the interest of a small area.

The main motivation of this paper is to achieve both high timing resolution and small area in column-parallel TDCs for TOF, PET, or FLIM applications. We employ a two-stage architecture, wherein the first stage operates as a coarse TDC (first TDC), and the time residue is amplified by a time difference amplifier (TDA) and then converted by the second-stage TDC (second TDC). The advantage of the two-stage TDC is not only its simplicity, but also that the noise of the second-stage TDC is reduced by a factor equal to the gain of the TDA, in a similar fashion as in two-stage analog-to-digital converters (ADCs).

In Section II, the principle of the proposed column-parallel TDC is presented. We describe the architecture, the circuit design, and the operation of the TDA and the proposed TDC in Section III. In Section IV, the measurement results, the performance of the TDA, and the proposed TDC are shown, including environmental effects due to the power supply and temperature fluctuations. The time resolution variation in the entire TDC array is also measured. Finally, the results are discussed in Section V.

## II. PROPOSED TDC

### A. TDC Structure

A two-stage ADC and a two-stage TDC are similar conceptually since they both have double conversion and amplification of the residue, as shown in Fig. 1, whereas a two-stage TDC calculates the time residue instead of the voltage residue. The LSB of the two-stage TDC,  $LSB_{two\_stage}$ , is calculated using the LSB of each stage,  $LSB_{one\_stage}$ , and the gain of the TDA, *G*, as follows:

$$LSB_{two\_stage} = \frac{LSB_{one\_stage}}{G}. \tag{1}$$



Fig. 1. Concept of a two-stage ADC and a two-stage TDC based on time difference amplification.

The LSB is thus inversely proportional to G, so the time resolution can in principle be made arbitrarily fine by increasing G. The difficulty of the TDC, compared to the ADC, is the amplification of the time difference. Unlike the voltage residue of an ADC, the time residue of a TDC cannot be easily stored on a capacitance. In [19], the authors implemented a large number of TDAs to calculate all possible time residues. However, the area required by such an approach can be large. The proposed TDC consists of a first-stage TDC (first TDC), a second-stage TDC (second TDC), and the TDA. We use only one TDA to reduce the area, employing the synchronizer described in the next section. The structures of the first TDC and the second TDC are almost the same.
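The two-stage principle of (1) can be sketched numerically. The following is a minimal, idealized model rather than the implemented circuit: it assumes a residue that starts at zero (the actual first TDC produces a residue between 0.5 and 1.5 VCO cycles), a noise-free TDA with an exact gain, and the hypothetical values 160 ps and 16 for the one-stage LSB and the gain.

```python
import math

def two_stage_convert(t_ps, lsb_one_stage_ps, gain):
    """Idealized two-stage conversion of a time interval t_ps (picoseconds).

    Returns (coarse_code, fine_code, reconstructed_ps).
    """
    # First stage: coarse quantization with the one-stage LSB.
    coarse = math.floor(t_ps / lsb_one_stage_ps)
    residue = t_ps - coarse * lsb_one_stage_ps
    # The TDA stretches the residue by `gain`; the second stage then
    # quantizes the stretched residue with the same one-stage LSB.
    fine = math.floor(residue * gain / lsb_one_stage_ps)
    # Eq. (1): the effective LSB is the one-stage LSB divided by the gain.
    lsb_two_stage = lsb_one_stage_ps / gain
    reconstructed = coarse * lsb_one_stage_ps + fine * lsb_two_stage
    return coarse, fine, reconstructed

# Hypothetical values: 160-ps one-stage LSB, gain of 16, 1234-ps input.
coarse, fine, t_hat = two_stage_convert(1234.0, 160.0, 16)
print(coarse, fine, t_hat)  # 7 11 1230.0
```

In this sketch the reconstruction error stays below the two-stage LSB of 10 ps, which is the behavior that (1) predicts.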

### B. Column-Parallel TDC

Fig. 2 shows a block diagram of the column-parallel TDC with the readout module. The column-parallel TDC consists of 128 TDCs and two dummy TDCs, one at each end of the array. Each TDC is connected to a single-photon avalanche diode (SPAD) [1], which is used not only to measure differential nonlinearity (DNL) and integral nonlinearity (INL) with density tests, but also to capture line-based 3-D images using TOF and lifetime images for FLIM applications. The readout multiplexers are placed between the first TDC and the TDA, and after the second TDC.

## III. DETAILED STRUCTURE AND CIRCUIT DESIGN

### A. First TDC and Second TDC Structure

Fig. 3(a) shows the schematic of the first TDC in the proposed two-stage TDC. The first TDC contains a dual-rail voltage-controlled oscillator (VCO), which is free-running after EN is issued, a phase detector, a VCO cycle counter, and a synchronizer, which calculates the time residue from the first conversion. The AND gate stops the oscillation of the VCO just after the first conversion finishes to reduce power consumption. A delay $\Delta t$ is added to reduce the time difference between STOP and SYNC so that it fits the input range of the TDA. The second TDC is almost identical, except that it has no synchronizer and uses a 5-bit counter instead of a 6-bit counter. Fig. 3(b) and (c) show a differential inverter in the VCO and the comparator in the phase detector. The delay of the differential inverter depends on the power supply, so the frequency of the



Fig. 2. Block diagram of the proposed column-parallel TDC.



Fig. 3. Block diagram of the first TDC. (a) Schematic of the first TDC. (b) Differential inverter in the VCO. (c) Comparator in the phase detector.

VCO is constant as long as the power supply is constant. Fig. 4 shows the timing diagram for generating the time residue from the first TDC and the relation between the time residue and the time difference between START and STOP. Fig. 4(a)–(c) show three examples of possible timing diagrams. The counter counts the pulses up to the rising edge of signal STOP. The first D-FF in the synchronizer is triggered by the rising edge, while the second D-FF is triggered by the falling edge because of the inverter between the first D-FF and the second D-FF. After the STOP signal is issued, STOPOUT is generated after a fixed delay, $\Delta t$. Then, STOPOUT and SYNC are given to the TDA as inputs. Thus, the time residue of the first TDC ranges from 0.5 to 1.5 VCO clock cycles, where one clock cycle is denoted $T_{vco}$. Fig. 4(d) shows the relation between the time residue and the START-STOP time



Fig. 4. Timing diagram for the time residue from the first TDC. (a)–(c) Examples of the timing diagram to calculate the time residue. (d) Relation between START and STOP time difference and the time residue.



Fig. 5. TDA structure. (a) Delay chain. (b) Current mirror for the TDA control signal. (c) Fast delay cell. (d) Slow delay cell.

difference. The time residue waveform is a sawtooth wave, and the maximum of the time residue is smaller than the maximum input time difference of the TDA.
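The sawtooth relation of Fig. 4(d) can be reproduced with a short behavioral model. This is a sketch under stated assumptions, not the implemented synchronizer: the VCO is taken to be ideal with rising edges at integer multiples of $T_{vco}$ after START and a 50% duty cycle, the added delay $\Delta t$ is set to zero, and the function name `time_residue` is hypothetical.

```python
import math

def time_residue(t, t_vco):
    """Ideal residue (SYNC - STOP) for a STOP arriving a time t after START.

    Assumes VCO rising edges at integer multiples of t_vco after START,
    a 50% duty cycle, and no added delay (delta_t = 0).
    """
    # First D-FF: captures STOP on the next rising edge strictly after t.
    next_rising = (math.floor(t / t_vco) + 1) * t_vco
    # Second D-FF: the falling edge half a period later produces SYNC.
    sync = next_rising + 0.5 * t_vco
    return sync - t

T_VCO = 1.0  # normalized VCO period
for t in (0.0, 0.25, 0.5, 0.75, 1.0, 1.25):
    print(t, time_residue(t, T_VCO))
```

Under these assumptions the residue falls linearly from 1.5 to 0.5 clock cycles within each VCO period and then jumps back, which is the sawtooth described above.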

### B. TDA Structure

Fig. 5 shows the basic structure of the TDA [24]. Fig. 5(a) shows a delay chain consisting of variable delay cells, where "F" and "S" denote a fast delay cell and a slow delay cell, respectively. The two input signals propagate along the chains at two different speeds. After the two signals cross over, fast delay cells become slow, and slow delay cells become fast. The input time difference is amplified by the ratio of F and S, which is the TDA gain. Fig. 5(b)–(d) show the schematics of the bias generator, the fast delay cell, and the slow delay cell. A delay cell is switched digitally between fast and slow by activating or deactivating transistors M4 and M2. The gain of the TDA is controlled by the bias voltage, $N_{\rm ctrl}$, which also controls $P_{\rm ctrl}$ to change the current flowing through the delay cells. When $N_{\rm ctrl}$ is larger than the threshold voltage, the gain is determined by the ratio of the on-currents of a fast and a slow delay cell as follows:

$$G = 1 + \frac{\beta_{\rm f} (V_{\rm dd} - V_{\rm t})^2}{\beta_{\rm s} (V_{N_{\rm ctrl}} - V_{\rm t})^2} \tag{2}$$

where $\beta_{\rm f}$ and $\beta_{\rm s}$ are the geometry- and technology-dependent parameters for transistors M2 and M3 or M4 and M5, and $V_{\rm t}$ is the threshold voltage. When we define the gate delay of a slow delay cell as $T_{\rm s}$ and the number of delay cells in a chain as $N$, the input range $T_{\rm tda\_range}$ is

$$T_{\rm tda\_range} = N \times T_{\rm s}. \tag{3}$$

The TDA has 16 delay cells in each chain in this design, but it is possible to increase the input range by employing larger numbers of delay cells.
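To make the dependence in (2) and (3) concrete, the short sketch below evaluates both expressions. The device parameters (`BETA_RATIO`, `VT`) are hypothetical placeholders chosen only for illustration; they are not extracted from the fabricated chip, so the printed gains should not be read as measured values.

```python
def tda_gain(beta_ratio, vdd, v_nctrl, vt):
    """Eq. (2): G = 1 + (beta_f / beta_s) * (Vdd - Vt)^2 / (V_Nctrl - Vt)^2."""
    if v_nctrl <= vt:
        raise ValueError("Eq. (2) only applies when V_Nctrl exceeds Vt")
    return 1.0 + beta_ratio * (vdd - vt) ** 2 / (v_nctrl - vt) ** 2

def tda_input_range(n_cells, t_s):
    """Eq. (3): input range = N * T_s."""
    return n_cells * t_s

# Hypothetical device parameters, assumed only for this illustration.
BETA_RATIO = 0.5  # beta_f / beta_s
VT = 0.5          # threshold voltage in volts

for v_nctrl in (1.25, 1.15, 1.05, 0.95):
    print(v_nctrl, round(tda_gain(BETA_RATIO, 3.3, v_nctrl, VT), 2))
```

The model shows the trend used throughout the measurements: lowering $N_{\rm ctrl}$ toward the threshold voltage raises the gain, and raising the supply voltage at a fixed $N_{\rm ctrl}$ raises it as well.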

### C. Calculation of TDC Digital Code

The LSB of the proposed TDC,  $LSB_{total}$  is calculated by using the gate delay  $T_d$  of a differential inverter in the VCO as follows:

$$LSB_{total} = \frac{T_d}{G}. \tag{4}$$

 $T_{\rm d}$  can be expressed in terms of the frequency of the VCO,  $f_{\rm vco}$ , as follows:

$$T_{\rm d} = \frac{1}{f_{\rm vco} \times 8} \tag{5}$$

where the factor 8 is a consequence of the fact that the VCO is composed of four differential inverters. Hence, the  $\rm LSB_{total}$  becomes

$$LSB_{total} = \frac{1}{f_{vco} \times 8 \times G}. \tag{6}$$

The dynamic range DR (bit length) is calculated using the maximum counter value of the first TDC, $C_{1\text{st}\_\text{max}}$. Assuming that the VCOs in the first TDC and the second TDC run at the same frequency,

$$DR = \log_2(8 \times C_{1st\_max} \times G) \tag{7}$$

Fig. 6 shows the output code from the first and the second TDC. The time residue is amplified by the TDA and then digitized by the second TDC. $N_{\text{max}}$ and $N_{\text{min}}$ are the maximum and the minimum digital values produced by the second TDC when the time residue is maximum and minimum, respectively. $N_{\text{th}}$ is the digital value of the time residue at which the counter value of the first TDC changes. PDM is the phase of the first TDC at which the time residue changes from minimum to maximum. The TDA gain, G, and LSB<sub>total</sub> are calculated using $N_{\text{max}}$ and $N_{\text{min}}$ as follows:

$$G = \frac{N_{\max} - N_{\min}}{8} \tag{8}$$

so the LSB<sub>total</sub> is rewritten from (6) with  $N_{\text{max}}$  and  $N_{\text{min}}$  as follows:

$$LSB_{total} = \frac{1}{f_{vco} \times (N_{max} - N_{min})}. \tag{9}$$
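Equations (6) to (9) can be checked against each other with a few lines of code. The sketch below assumes a VCO frequency of 700 MHz, calibration codes of 200 and 40, and a maximum counter value of 63 for the 6-bit counter; none of these numbers is stated in the text, and they serve only to show that (6) and (9) give the same LSB once the gain is obtained from (8).

```python
import math

def lsb_from_vco(f_vco, gain):
    """Eq. (6): LSB_total = 1 / (8 * f_vco * G), in seconds."""
    return 1.0 / (f_vco * 8 * gain)

def dynamic_range_bits(c_1st_max, gain):
    """Eq. (7): DR = log2(8 * C_1st_max * G)."""
    return math.log2(8 * c_1st_max * gain)

def gain_from_codes(n_max, n_min):
    """Eq. (8): G = (N_max - N_min) / 8."""
    return (n_max - n_min) / 8

def lsb_from_codes(f_vco, n_max, n_min):
    """Eq. (9): LSB_total = 1 / (f_vco * (N_max - N_min)), in seconds."""
    return 1.0 / (f_vco * (n_max - n_min))

F_VCO = 700e6            # assumed VCO frequency in Hz
N_MAX, N_MIN = 200, 40   # assumed calibration codes
C_1ST_MAX = 63           # assumed maximum of a 6-bit counter

g = gain_from_codes(N_MAX, N_MIN)
print(g)  # 20.0
print(lsb_from_vco(F_VCO, g) * 1e12)                # LSB in ps via Eq. (6)
print(lsb_from_codes(F_VCO, N_MAX, N_MIN) * 1e12)   # same LSB via Eq. (9)
print(dynamic_range_bits(C_1ST_MAX, g))             # dynamic range in bits
```

The practical point of (9) is visible here: the LSB follows from the VCO frequency and two calibration codes alone, without a separate measurement of the TDA gain.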



Fig. 6. Digital code calculation using outputs from the first TDC and the second TDC.

When the counter value and the phase from the first TDC are defined as $C_{1\text{st}}$ and $P_{1\text{st}}$, and the counter value and the phase from the second TDC are defined as $C_{2\text{nd}}$ and $P_{2\text{nd}}$, the raw digital code from the second TDC, $D_{2\text{nd}}$, is calculated as follows:

$$D_{2nd} = 8 \times C_{2nd} + P_{2nd}. \tag{10}$$

When the output codes from the first TDC and the second TDC are combined, $D_{2nd}$ should be modified to $D_{2nd\_mod}$ because the counter and the time residue from the first TDC are not synchronized. Therefore, $D_{2nd\_mod}$ is calculated as follows:

$$D_{2nd\_mod} = \begin{cases} N_{max} - D_{2nd} + (N_{th} - N_{min} + 1), & (P_{1st} > PDM - 1 \ \&\ D_{2nd} > N_{th}) \\ N_{th} - D_{2nd}, & (\text{else}). \end{cases} \tag{11}$$

Finally, the TDC code,  $D_{total}$ , is calculated as follows:

$$D_{\text{total}} = C_{1\text{st}} \times (N_{\text{max}} - N_{\text{min}}) + D_{2\text{nd}\_\text{mod}} \tag{12}$$

where  $N_{\text{max}}$ ,  $N_{\text{min}}$ ,  $N_{th}$ , and PDM should be calibrated in advance.
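The code calculation of (10) to (12) is what the host PC performs for every hit, and it can be written out as follows. This is a minimal sketch of the three equations as stated above; the function names and the calibration constants in the example are hypothetical.

```python
def raw_second_code(c_2nd, p_2nd):
    """Eq. (10): raw second-stage code from counter value and phase."""
    return 8 * c_2nd + p_2nd

def modified_second_code(d_2nd, p_1st, n_max, n_min, n_th, pdm):
    """Eq. (11): re-reference the second-stage code to the first-stage counter."""
    if p_1st > pdm - 1 and d_2nd > n_th:
        return n_max - d_2nd + (n_th - n_min + 1)
    return n_th - d_2nd

def total_code(c_1st, d_2nd_mod, n_max, n_min):
    """Eq. (12): combine the first-stage counter with the modified code."""
    return c_1st * (n_max - n_min) + d_2nd_mod

# Hypothetical calibration constants (to be measured in advance on a real chip).
N_MAX, N_MIN, N_TH, PDM = 200, 40, 120, 4

def convert(c_1st, p_1st, c_2nd, p_2nd):
    d_2nd = raw_second_code(c_2nd, p_2nd)
    d_mod = modified_second_code(d_2nd, p_1st, N_MAX, N_MIN, N_TH, PDM)
    return total_code(c_1st, d_mod, N_MAX, N_MIN)

print(convert(c_1st=3, p_1st=5, c_2nd=18, p_2nd=6))  # first case of (11): 611
print(convert(c_1st=3, p_1st=2, c_2nd=12, p_2nd=4))  # else case of (11): 500
```

Because $D_{2nd}$ enters (11) with a negative sign, a larger residue code maps to a smaller final code, consistent with the residue being measured from STOP to the following synchronized edge.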

### D. Pixel Circuit

Fig. 7 shows the schematic of one pixel. A separate pixel is connected to each TDC. The pixel consists of a 6- $\mu$ m-diameter SPAD, a quenching transistor, a reset transistor, an off transistor, and a buffer to pulse-shape the output from the SPAD. A SPAD is a p+/deep n-well junction using p-well as a guard ring [1]. The breakdown voltage is 19.2 V [25]. The pixel circuit has both passive recharge and active recharge circuitries, controlled by signals VB and RST. The SPAD can be disabled by the OFF signal.

### E. TDC Operation

Fig. 8 shows the timing diagram of the two-stage TDC. After RST is applied to reset the VCOs, the counters, and the phase detectors, the TDC is ready for the START signal. Upon arrival of the START signal, the VCO in the first stage TDC begins oscillating and the counter counts the number of cycles of the VCO.



Fig. 7. Pixel structure.



Fig. 8. Timing diagram.

SYNC, which is an output from the synchronizer in Fig. 3, rises in synchronization with the second falling edge of the VCO after STOP arrives. The inputs for the TDA are STOP and SYNC, and the time difference between STOP and SYNC is less than 1.5 clock cycles of the VCO because the synchronizer uses both edges of the VCO output to reduce the maximum time residue. Thus, the time residue of the first TDC is larger than 0.5 clock cycles but smaller than 1.5 clock cycles. After the time residue is amplified, START2ND and STOP2ND act in the second TDC as START and STOP do in the first TDC. After the conversion is complete, the digital data are latched by LATCH and, once the address of the TDC is input, read out through the first TDC output multiplexer and the second TDC output multiplexer to the shift register outside the TDC array. The output from the multiplexer is stored in the shift register and read out of the chip bit by bit. All control signals are generated by an FPGA, and the data are sent to the host PC through the FPGA. The host PC calculates the final digital codes. The conversion time is 320 ns, but it takes 102 $\mu$s to read out all data because a 1-bit shift register is used for readout.
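The two timing figures above bound the achievable acquisition rate. The estimate below is a back-of-the-envelope sketch that assumes conversion and readout do not overlap; whether the chip allows any pipelining is not stated in the text.

```python
CONVERSION_TIME_S = 320e-9  # conversion time, from the text
READOUT_TIME_S = 102e-6     # readout time for all channels, from the text

def line_rate_hz(conversion_s, readout_s):
    """Acquisitions per second if conversion and readout run back to back."""
    return 1.0 / (conversion_s + readout_s)

rate = line_rate_hz(CONVERSION_TIME_S, READOUT_TIME_S)
readout_share = READOUT_TIME_S / (CONVERSION_TIME_S + READOUT_TIME_S)
print(f"{rate:.0f} acquisitions/s, readout share {readout_share:.1%}")
```

Under this assumption the rate is just under 10 000 acquisitions per second, and the serial readout rather than the conversion accounts for almost all of the cycle time.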

## IV. MEASUREMENTS

### A. Chip Implementation and Measurement Setup

We have designed and fabricated the proposed 128-channel column-parallel TDC using a high-voltage 0.35- $\mu$ m CMOS process [26]. Fig. 9 shows the microphotograph. Besides 128 TDCs, two dummy TDCs were also added to ensure edge



Fig. 9. Chip microphotograph.



Fig. 10. TDA input-output measurement result.

uniformity, and one SPAD [1] is connected to each TDC, thus minimizing parasitics and propagation time between SPAD and TDC. The TDC, including the readout multiplexer, occupies  $24 \times 1100 \ \mu\text{m}^2$ , of which the first and the second TDC occupy  $24 \times 130 \ \mu\text{m}^2$ , the TDA  $24 \times 550 \ \mu\text{m}^2$ , and the readout multiplexer  $24 \times 300 \ \mu\text{m}^2$ . The total size of the TDC array, including dummies, is  $3.12 \times 1.1 \ \text{mm}^2$ . The printed circuit board (PCB) for the TDC array chip is connected to a Xilinx ML507 board. The embedded Virtex 5 FPGA generates control signals and receives outputs from the TDC array chip, and then communicates with the PC used for data acquisition over Ethernet.

### B. TDA Characterization

Fig. 10 shows the measured relation between the input time difference and the output time difference of a TDA as a function of bias voltage. The gain changes from 8.5 to 20.4 as the TDA bias voltage, $N_{\rm ctrl}$, changes from 1.25 to 0.95 V, following (2). Fig. 11(a) summarizes the TDA gain at each $N_{\rm ctrl}$ for three power supply voltages: 3.1, 3.3, and 3.5 V. When the power supply voltage changes by $\Delta V$, the gain changes as follows:

$$G(\Delta V) = 1 + \frac{\beta_{\rm f} (V_{\rm dd} + \Delta V - V_{\rm t})^2}{\beta_{\rm s} (V_{N_{\rm ctrl}} - V_{\rm t})^2}. \tag{13}$$

Fig. 11(b) shows the jitter of a single TDC introduced by the TDA. This measurement was carried out by repeating the same measurement with an identical input time difference between the external electrical START and STOP signals generated by the FPGA. These measurements include the jitter of the input signals and the jitter caused by the first TDC and the second TDC. The measured jitter indicates the TDC's single-shot precision, which



Fig. 11. (a) TDA gain changes by  $N_{\rm ctrl}.$  (b) TDC jitter changes by  $N_{\rm ctrl}.$ 

shows the timing accuracy for a single measurement. The measurement shows that the jitter does not increase even if the gain increases, though the jitter is worse when the power supply voltage is decreased from nominal (3.3 V) at an $N_{\rm ctrl}$ of 0.95 V. This result means that the jitter of the input signal or the jitter caused by the first TDC is small compared to the intrinsic jitter of the TDA itself.
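The single-shot precision described above is the spread of the output codes obtained for one fixed input interval. A minimal sketch of that computation follows; the code list is hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics

def single_shot_precision_ps(codes, lsb_ps):
    """RMS jitter: standard deviation of repeated codes, scaled to ps."""
    return statistics.pstdev(codes) * lsb_ps

# Hypothetical codes from repeated conversions of one fixed interval.
codes = [4495, 4496, 4494, 4495, 4497, 4495, 4493, 4495]
print(single_shot_precision_ps(codes, 8.9))
```

Scaling by the LSB expresses the jitter in picoseconds, so results taken at different TDA gains can be compared directly.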

### C. TDC Characterization

We measured one TDC to characterize the LSB, DNL, and INL as a function of the TDA bias voltage, $N_{\rm ctrl}$, and the power supply at room temperature. Fig. 12 shows the results of the density test used to acquire DNL and INL when the input range is 29.3 ns; the SPAD is used to generate random START signals for the TDC [27]. According to the DNL/INL measurement, the number of digital codes increases as $N_{\rm ctrl}$ is reduced. The LSB thus becomes smaller as $N_{\rm ctrl}$ decreases because of the higher gain of the TDA, varying from 21.4 to 8.9 ps. The spurs in the DNL measurement occur when the time residue of the first TDC is close to its minimum or maximum. The short-period fluctuation in the INL measurement results from the variation of the TDA gain with the input time difference to the TDA, as shown in Fig. 10. Fig. 13 shows the peak-to-peak INL error globally (over the whole input range) and locally (over one period of the short fluctuation, from one maximum to the next) as a function of $N_{\rm ctrl}$. The local peak-to-peak error increases as $N_{\rm ctrl}$ decreases, in accordance with the LSB, but the global peak-to-peak error does not vary appreciably. The reason is that the global error comes from the first TDC nonlinearity due to dynamic power supply noise. Therefore, we can see that the INL error of the TDA is independent of its gain.
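The density test used here can be summarized in a few lines. The sketch below is a generic code-density calculation, not the authors' analysis script: it assumes that hit times are uniformly distributed over the input range, so that every code should collect the same number of hits, and the histogram values are hypothetical.

```python
def dnl_inl_from_histogram(hist):
    """Code-density test: DNL and INL (both in LSB) from per-code hit counts."""
    mean_hits = sum(hist) / len(hist)
    # A bin that collects more hits than average is wider than one LSB.
    dnl = [h / mean_hits - 1.0 for h in hist]
    # INL is the running sum of the DNL.
    inl, running = [], 0.0
    for d in dnl:
        running += d
        inl.append(running)
    return dnl, inl

def lsb_from_density_test(input_range_ps, n_codes):
    """Average LSB when n_codes codes span the given input range."""
    return input_range_ps / n_codes

hist = [100, 110, 90, 100]  # hypothetical hit counts per code
dnl, inl = dnl_inl_from_histogram(hist)
print(dnl)
print(inl)
```

The same histogram also yields the LSB: dividing the known input range (29.3 ns in this measurement) by the number of codes that receive hits gives the average bin width.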



Fig. 12. Measured DNL and INL in  $N_{\rm ctrl}$  of the TDA: (a) 0.95, (b) 1.05, (c) 1.15, and (d) 1.25 V.



Fig. 13. Global and local INL error in each  $N_{\rm ctrl}$  from the point of view of LSB and picoseconds (ps): Global INL error means a peak-to-peak value in the whole TDC input range, and local INL error means a peak-to-peak value in a periodic INL fluctuation.

### D. Environmental Tolerance

As the TDA gain is dependent on power supply fluctuations, the TDC LSB also suffers from power supply and temperature fluctuations. Fig. 14 details such dependence. When the power supply voltage increases, both the frequency of the VCO and the TDA gain increase, so the LSB will decrease according to (6). Fig. 14(a) shows the measurement results when the power supply voltage changes from 3.1 to 3.5 V: +19.2%/-13.7%



Fig. 14. LSB versus  $N_{\rm ctrl}$ : (a) with  $\pm 0.2$  V power supply fluctuation; (b) with temperature fluctuation from 0°C to 60°C with 20°C step. (c) Local INL error varies in temperature.

and +19.3%/-14.8% in the $\pm 0.2$-V power supply voltage range when $N_{\text{ctrl}}$ is 0.95 and 1.25 V, respectively. These variations can be reduced by calibrating the VCO frequency and the TDA gain using a lookup table made in advance or by placing the VCO and TDA in phase-locked loops. To illustrate the calibration of the VCO frequency and the TDA gain by lookup table, Fig. 15 shows the digital code distribution of 4 million hits when the input time difference is fixed at 40 ns with a TDA bias voltage of 0.95 V. Before calibration, the digital codes for the different power supply voltages vary by more than 15%, as shown in Fig. 15(a). When the digital codes are compensated by the lookup table, the variation is about 1%, as shown in Fig. 15(b). To reduce the influence of power supply fluctuation, one can also employ a current-mode logic (CML) delay cell instead of a CMOS inverter-type delay cell as the TDA variable delay cell. This is due to the fact that the tail current controls the delay of fast and slow delay cells, hence the ratio of the two delays is decided independently of the



Fig. 15. TDC output code at 40 ns input time difference (a) without calibration and (b) with calibration.

power supply. The VCO could also be made tolerant to power supply fluctuation as shown in [28].
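The lookup-table calibration can be illustrated with an idealized model. In the sketch below, the table entries are hypothetical: they apply the +19.2%/-13.7% LSB shifts quoted above to the nominal 8.9-ps LSB, assuming the positive shift belongs to the lower supply voltage, and the conversion is taken to be noise-free. It is not the authors' calibration code.

```python
NOMINAL_LSB_PS = 8.9  # LSB at 3.3 V and N_ctrl = 0.95 V, from the text

# Hypothetical lookup table: LSB in ps for each supply voltage in volts.
LSB_LUT_PS = {
    3.1: NOMINAL_LSB_PS * 1.192,
    3.3: NOMINAL_LSB_PS,
    3.5: NOMINAL_LSB_PS * 0.863,
}

def raw_code(interval_ps, vdd):
    """Idealized, noise-free code produced at a given supply voltage."""
    return round(interval_ps / LSB_LUT_PS[vdd])

def calibrated_time_ps(code, vdd):
    """Convert a raw code to time with the LSB tabulated for that supply."""
    return code * LSB_LUT_PS[vdd]

for vdd in (3.1, 3.3, 3.5):
    code = raw_code(40_000.0, vdd)  # 40-ns input interval
    print(vdd, code, round(calibrated_time_ps(code, vdd), 1))
```

In this idealized case the raw codes differ strongly between supply voltages while the calibrated times agree to within half an LSB; the residual variation of about 1% measured on the chip reflects effects the model leaves out.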

When the temperature changes from $60^{\circ}$C to $0^{\circ}$C, the VCO frequency increases by 5.6%. Because of this shift of the VCO frequency, the input range of the TDA becomes insufficient to cover the whole time residue from the first TDC. This means that the TDA cannot amplify the time residue properly, so the gain also changes. As a result, the change of the frequency is offset by the change of the TDA gain. Fig. 14(b) shows the effect of temperature on the LSB from $0^{\circ}$C to $60^{\circ}$C. The local INL error increases by 36% when the temperature changes from $0^{\circ}$C to $60^{\circ}$C, as shown in Fig. 14(c). To cover the whole time residue, the number of delay cells in the TDA should be increased to extend the input range; the lookup table method can then be used to compensate for the LSB shift with temperature.

### E. Uniformity Measurement Results

We measured the LSB variation over all 128 TDCs by means of the density test using SPADs for $N_{\rm ctrl}$ of 1.25 and 1.15 V; the gain is 8.5 and 12, so the LSB should be 21.4 and 17.1 ps, respectively. Fig. 16 shows the LSB variation. When $N_{\rm ctrl}$ is 1.25 V, the average LSB is 12.93 ps with a standard deviation of 0.75 ps, corresponding to 5.8% of the LSB, and when $N_{\rm ctrl}$ is 1.15 V, the average LSB is 7.55 ps with a standard deviation of 0.51 ps, corresponding to 6.8% of the LSB. The variation is due to the frequency variation of the VCOs in the first and the second TDCs and to the variation of the TDA gain. The IR drop of the power supply voltage and of $N_{\rm ctrl}$ is another reason for the LSB variation: because many TDCs operate at the same time, the actual LSB is lower than the predicted LSB. This behavior implies that the IR drop of the power supply voltage affects $N_{\rm ctrl}$ because $N_{\rm ctrl}$ is generated from the power supply voltage. From Fig. 14(a), we can estimate the IR drop of $N_{\rm ctrl}$ to be about 200 mV because



Fig. 16. LSB variation in 128 column-parallel TDC in 1.25-V  $N_{\rm ctrl}$  and 1.15-V  $N_{\rm ctrl}$ 

the LSB in this uniformity measurement at $N_{\rm ctrl}$ of 1.25 and 1.15 V becomes close to the LSB characterized in the single-TDC measurement at $N_{\rm ctrl}$ of 1.05 and 0.95 V, respectively. The effects of the IR drop can be easily compensated for offline since the number of TDCs that fired is known and a simple model of the IR drop can be built. Moreover, they are negligible when the TDCs operate in photon-starved mode.
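The uniformity figures quoted above are the standard deviation of the per-channel LSB divided by its mean. The sketch below shows that computation and checks the reported percentages from the reported means and standard deviations; the helper name is hypothetical, and the choice of the population standard deviation is an assumption.

```python
import statistics

def relative_spread_percent(lsbs_ps):
    """Standard deviation of per-channel LSBs as a percentage of their mean."""
    return 100.0 * statistics.pstdev(lsbs_ps) / statistics.mean(lsbs_ps)

# The reported summary values reproduce the quoted percentages.
print(round(100.0 * 0.75 / 12.93, 1))  # 5.8
print(round(100.0 * 0.51 / 7.55, 1))   # 6.8

# Hypothetical per-channel LSBs, only to show how the helper is called.
print(relative_spread_percent([12.1, 13.4, 12.9, 13.6, 12.6]))
```

An offline IR-drop compensation of the kind mentioned above would adjust each channel's LSB before this statistic is taken, which is why the uncalibrated spread should be read as an upper bound.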

## V. CONCLUSION

We have proposed a 128-channel column-parallel two-stage TDC utilizing a TDA implemented in a 0.35-$\mu$m CMOS process. The first stage operates as a coarse TDC, and the time residue that is not converted in the first stage is amplified by a TDA and then converted by the second-stage TDC. As the gain of the time difference amplifier can be adjusted from 8.5 to 20.4, the time resolution of the TDC can be tuned from 21.4 to 8.9 ps. The time resolution variation due to PVT effects is $\pm 5.8\%$ without calibration when the time resolution is 12.9 ps. We have also shown a calibration technique for the shift of the VCO frequency and the TDA gain caused by power supply fluctuation, which results in only 1% variation from 3.1 to 3.5 V.

#### ACKNOWLEDGMENT

The authors would like to thank Dr. Y. Maruyama, Dr. H.-J. Yoon, and M. Fishburn of TU Delft for their technical support during the design, as well as for many useful collaborative discussions.

#### REFERENCES

- [1] C. Niclass, C. Favi, T. Kluter, M. A. Gersbach, and E. Charbon, "A 128 × 128 single-photon image sensor with column-level 10-bit time-to-digital converter array," *IEEE J. Solid-State Circuits*, vol. 43, no. 12, pp. 2977–2989, Dec. 2008.
- [2] M. Gersbach, Y. Maruyama, R. Trimananda, M. W. Fishburn, D. Stoppa, J. A. Richardson, R. Walker, R. Henderson, and E. Charbon, "A time-resolved, low-noise single-photon image sensor fabricated in deep-submicron CMOS technology," *IEEE J. Solid-State Circuits*, vol. 47, no. 6, pp. 1394–1407, May 2012.

- [3] D. Stoppa, F. Borghetti, J. Richardson, R. Walker, L. Grant, R. K. Henderson, M. Gersbach, and E. Charbon, "A 32 × 32-pixel array with in-pixel photon counting and arrival time measurement in the analog domain," in *Proc. IEEE ESSCIRC*, 2009, pp. 204–207.
- [4] J. Richardson, R. Walker, L. Grant, D. Stoppa, F. Borghetti, E. Charbon, M. Gersbach, and R. K. Henderson, "A 32 × 32 50 ps resolution 10 bit time to digital converter array in 130 nm CMOS for time correlated imaging," in *Proc. IEEE CICC*, 2009, pp. 77–80.
- [5] C. Veerappan, J. Richardson, R. Walker, D. U. Li, M. Fishburn, Y. Maruyama, D. Stoppa, F. Borghetti, M. Gersbach, R. K. Henderson, and E. Charbon, "A 160 × 128 single-photon image sensor with on-pixel 55 ps 10 b time-to-digital converter," in *IEEE ISSCC Dig. Tech. Papers*, 2011, pp. 312–314.
- [6] M. W. Fishburn and E. Charbon, "System trade-offs in gamma-ray detection utilizing SPAD arrays and scintillators," *IEEE Trans. Nucl. Sci.*, vol. 57, no. 5, pp. 2549–2557, Oct. 2010.
- [7] S. Seifert, H. T. van Dam, and D. R. Schaart, "The lower bound on the timing resolution of scintillation detectors," *Phys. Med. Biol.*, vol. 57, pp. 1797–1814, 2012.
- [8] E. Bayer and M. Traxler, "A high-resolution (<10 ps rms) 48-channel time-to-digital converter (TDC) implemented in a field programmable gate array (FPGA)," *IEEE Trans. Nucl. Sci.*, vol. 58, no. 4, pp. 1547–1552, Aug. 2011.
- [9] J. Wang, S. Liu, L. Zhao, X. Hu, and Q. An, "The 10-ps multitime measurements averaging TDC implemented in an FPGA," *IEEE Trans. Nucl. Sci.*, vol. 58, no. 4, pp. 2011–2018, Aug. 2011.
- [10] R. B. Staszewski, S. Vemulapalli, P. Vallur, J. Wallberg, and P. T. Balsara, "1.3 V 20 ps time-to-digital converter for frequency synthesis in 90-nm CMOS," *IEEE Trans. Circuits Syst. II*, vol. 53, no. 3, pp. 220–224, Mar. 2006.
- [11] M. A. Karami, M. Gersbach, H. J. Yoon, and E. Charbon, "A new single-photon avalanche diode in 90 nm standard CMOS technology," *Opt. Exp.*, vol. 18, no. 21, 2010.
- [12] S. Henzler, S. Koeppe, W. Kamp, H. Mulatz, and D. Schmitt-Landsiedel, "90 nm 4.7 ps-resolution 0.7-LSB single-shot precision and 19 pJ-per-shot local passive interpolation time-to-digital converter with on-chip characterization," in *IEEE ISSCC Dig. Tech. Papers*, 2008, pp. 548–635.
- [13] J. P. Jansson, A. Mantyniemi, and J. Kostamovaara, "A CMOS time-to-digital converter with better than 10 ps single-shot precision," *IEEE J. Solid-State Circuits*, vol. 41, no. 6, pp. 1286–1296, Jun. 2006.
- [14] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit vernier ring time-to-digital converter in 0.13 μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 830–842, Apr. 2010.
- [15] L. Vercesi, A. Liscidini, and R. Castello, "Two-dimensions vernier time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 8, pp. 1504–1512, Aug. 2010.

- [16] N. Xing, J. K. Woo, W. Y. Shin, H. Lee, and S. Kim, "A 14.6 ps resolution, 50 ns input-range cyclic time-to-digital converter using fractional difference conversion method," *IEEE Trans. Circuits Syst. I*, vol. 57, no. 12, pp. 3064–3072, Dec. 2010.
- [17] P. Chen, S. L. Liu, and J. Wu, "A CMOS pulse-shrinking delay element for time interval measurement," *IEEE Trans. Circuits Syst. II*, vol. 47, no. 9, pp. 954–958, Sep. 2000.
- [18] A. Mantyniemi, T. Rahkonen, and J. Kostamovaara, "A CMOS time-to-digital converter based on a cyclic time domain successive approximation interpolation method," *IEEE J. Solid-State Circuits*, vol. 44, no. 11, pp. 3067–3078, Nov. 2009.
- [19] M. Lee and A. A. Abidi, "A 9 b, 1.25 ps resolution coarse-fine time-to-digital converter in 90 nm CMOS that amplifies a time residue," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 769–777, Apr. 2008.
- [20] S. Mandai, T. Iizuka, T. Nakura, M. Ikeda, and K. Asada, "Time-to-digital converter based on time difference amplifier with non-linearity calibration," in *Proc. IEEE ESSCIRC*, 2010, pp. 266–269.
- [21] M. Z. Straayer and M. H. Perrott, "A multi-path gated ring oscillator TDC with first-order noise shaping," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, Apr. 2009.
- [22] Y. Cao, P. Leroux, W. D. Cock, and M. Steyaert, "A 1.7 mW 11 b 1-1-1 MASH $\Delta\Sigma$ time-to-digital converter," in *IEEE ISSCC Dig. Tech. Papers*, 2011, pp. 480–482.
- [23] M. Gersbach, Y. Maruyama, E. Labonne, J. Richardson, R. Walker, L. Grant, R. K. Henderson, F. Borghetti, D. Stoppa, and E. Charbon, "A parallel 32 × 32 time-to-digital converter array fabricated in a 130 nm imaging CMOS technology," in *Proc. IEEE ESSCIRC*, 2009, pp. 196–199.
- [24] T. Nakura, S. Mandai, M. Ikeda, and K. Asada, "Time difference amplifier using closed-loop gain control," in *IEEE VLSI Symp. Dig. Tech. Papers*, 2009, pp. 208–209.
- [25] M. W. Fishburn, Y. Maruyama, and E. Charbon, "Reduction of fixed-position noise in position-sensitive single-photon avalanche diodes," *IEEE Trans. Electron Devices*, vol. 58, no. 8, pp. 2354–2361, Dec. 2011.
- [26] S. Mandai and E. Charbon, "A 128-channel, 9 ps column-parallel two-stage TDC based on time difference amplification for time-resolved imaging," in *Proc. IEEE ESSCIRC*, 2011, pp. 119–122.
- [27] C. Favi and E. Charbon, "A 17 ps time-to-digital converter implemented in 65 nm FPGA technology," in *Proc. ACM/SIGDA Int. Symp. FPGA*, 2009, pp. 113–120.
- [28] T. Wu, K. Mayaram, and U. Moon, "An on-chip calibration technique for reducing supply voltage sensitivity in ring oscillators," *IEEE J. Solid-State Circuits*, vol. 42, no. 4, pp. 775–783, Apr. 2007.