# Fundamentals of a Scalable Network in SPADnet-based PET Systems

Martijn Bijwaard, Chockalingam Veerappan, Student Member, IEEE, Claudio Bruschini, Member, IEEE, and Edoardo Charbon, Senior Member, IEEE

*Abstract*—In SPADnet we have advocated the use of standardized photonic modules for modular assembly of PET systems. In this paper, we tackle the scalability problem, starting from synchronization. Since the photonic modules in a ring must ensure tens of picoseconds in timing accuracy, it is essential that the synchronization of each module be accurate, irrespective of the number of modules in a ring, the ring's size, and the number of rings in the system. We propose a hybrid solution, where a hard-wired clock synchronization network is combined with a network-based clock offset estimator. This combination enables scalability while maintaining high precision. A novel least squares synchronization algorithm is optimized and implemented in hardware equipped with a delay line FPGA TDC, allowing picosecond clock synchronization. The solution is verified in an 8 node system.

# I. INTRODUCTION

THIS paper focuses on scalability in PET systems built from a number of standardized photonic modules that are organized in a single ring or in multiple rings, as required, for example, in clinical PET.



Fig. 1: SPADnet photonic module, showing the FPGA processing and communication board on the top, and the sensor tile on the bottom (facing downwards).

This work is supported in part by the European Community within the FP7 SPADnet project.

M. Bijwaard was with Delft University of Technology, Delft, The Netherlands, at the time of this work.

C. Veerappan and E. Charbon are with Delft University of Technology, Delft, The Netherlands.

C. Bruschini is with Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland.

Disclaimer: This publication reflects only the authors' views. The European Community is not liable for any use that may be made of the information contained herein.

To achieve scalability, one must first use a photonic module directly capable of converting incoming gamma photons into timing, position and energy information organized in a digital format, as in [1], [2], [3]. One such photonic module, built in the SPADnet project [4], is presented in Figure 1. In [5] we proposed a sensor network-based approach to enable data acquisition scalability. In this paper we propose a network-based synchronization scheme that incorporates timing synchronization between photonic modules into the sensor network architecture [6]. The proposed synchronization scheme can in principle scale to different PET modalities.

A network-based synchronization scheme for PET applications is essential in realistic implementations of SPADnet-based systems [6], because such systems may comprise a large number of rings of photonic modules that can be reconfigured at will with no or minimal changes to the overall architecture of the scanner.

Multiple solutions for synchronization in wireless sensor networks can be readily found in the literature. In general, however, these solutions synchronize clocks to within milliseconds down to microseconds. SPADnet demands more precise synchronization, on the order of tens of picoseconds. A class of purely network-based clock synchronization algorithms, as presented in [7], was therefore examined.

We found these approaches to be largely inadequate for our goals. We thus propose an alternative that offers the scalability advantages of purely network-based synchronization while ensuring the desired timing accuracy. The proposed approach combines a hard-wired clock synchronization network with a network-based clock offset estimator, effectively measuring and correcting for errors in synchronization.

# II. PURELY NETWORK-BASED CLOCK SYNCHRONIZATION

Purely network-based clock synchronization works by executing message exchanges between sensor nodes while recording transmission and arrival times. The investigated clock synchronization algorithms [7] are indeed capable of reaching the theoretical limit.

In practice, however, the performance is limited by the long-term (> 1 s) stability of the network modules' local clock synthesizers. The SPADnet modules are equipped with low-cost quartz oscillators, which achieve a best Allan variance [8] of 22 ns for a coherence time region of $\tau = 2$ to $16$ s (Figure 2). For SPADnet this is unacceptable, since the clocks are required to be synchronized in the range of picoseconds. Alternative clock sources, such as rubidium or cesium standards, can provide satisfactory stability, but they are far too bulky for our application.
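For readers unfamiliar with the stability metric of [8], the following is a minimal Python sketch of the standard overlapping Allan deviation estimator computed from clock phase (time-error) samples. It is not the authors' measurement code, and the text does not state how the curve of Figure 2 was computed. The function name and the sample layout (`phase_s` in seconds, spaced `tau0_s` apart) are assumptions made for illustration.

```python
import math


def overlapping_allan_deviation(phase_s, tau0_s, m):
    """Overlapping Allan deviation at averaging time tau = m * tau0_s.

    phase_s : time-error samples x[n] of the clock, in seconds
    tau0_s  : sampling interval of phase_s, in seconds
    m       : averaging factor (positive integer)
    Returns the dimensionless (fractional frequency) Allan deviation.
    """
    n = len(phase_s)
    if m < 1 or n < 2 * m + 1:
        raise ValueError("need m >= 1 and at least 2*m + 1 phase samples")
    tau = m * tau0_s
    acc = 0.0
    # Sum the squared second differences of the phase at lag m.
    for i in range(n - 2 * m):
        second_diff = phase_s[i + 2 * m] - 2.0 * phase_s[i + m] + phase_s[i]
        acc += second_diff * second_diff
    variance = acc / (2.0 * tau * tau * (n - 2 * m))
    return math.sqrt(variance)
```

A constant frequency offset (a phase that grows linearly) yields zero, because the second differences cancel; only fluctuations of the frequency contribute to the result.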



Fig. 2: Clock stability measurement results with purely network-based clock synchronization, featuring 3 operation regimes.

# III. HYBRID CLOCK SYNCHRONIZATION SOLUTION

Currently SPADnet relies on a daisy-chained clock network which, while sufficient for smaller implementations, does not scale well to larger systems, as the user needs to manually recalibrate the clock phases. However, since all the modules run on the same clock, the long-term clock stability is ensured. Thus, clock correction algorithms can be used to estimate all phase offsets, instead of measuring and correcting them manually. In this hybrid approach, scalability is guaranteed while at the same time ensuring a timing dephasing of a few tens of picoseconds over the long periods during which the scan is performed.

We selected the pairwise least squares (PLS) solution [7], since its low complexity compared to other approaches makes it more suitable for implementation in hardware.

## A. Hardware implementation

The PLS algorithm finds an estimate for the clock skew, phase and distance between pairs of nodes [7]. Note that the algorithm is optimized here so that it can be implemented efficiently in hardware. The estimate is given by

$$\hat{\boldsymbol{\theta}}_{j} = \left(\mathbf{A}_{ji}^{T}\mathbf{A}_{ji}\right)^{-1}\mathbf{A}_{ji}^{T}\mathbf{t}_{ij}, \qquad (1)$$

with

$$\mathbf{A}_{ji} = [\mathbf{t}_{ji} \ \mathbf{1}_{2k} \ \mathbf{e}] \in \mathbb{R}^{2k \times 3},$$
$$\boldsymbol{\theta}_j = [\alpha_j \ \beta_j \ \tau_{ij}]^T \in \mathbb{R}^{3 \times 1},$$

where $\mathbf{t}_{ij}, \mathbf{t}_{ji} \in \mathbb{R}^{2k \times 1}$ are the timestamps recorded at nodes *i* and *j*, respectively, for all *k* two-way communications, $\tau_{ij}$ is the pairwise distance between nodes *i* and *j*, $\mathbf{e} = [-1, +1, -1, \cdots, +1]^T \in \mathbb{R}^{2k \times 1}$, $\mathbf{1}_{2k} \in \mathbb{R}^{2k \times 1}$ is a vector of ones of length 2k, and $\alpha_j, \beta_j$ are the correction parameters for the clock skew $\omega$ and clock offset $\phi$ of node *j*. Thanks to the distributed clock network, there is no clock drift in the system, so we do not need to estimate the clock skew $\alpha_j$. With $\alpha_j = 1$, Equation (1) simplifies to:

$$\hat{\boldsymbol{\theta}}'_{j} = \left(\mathbf{A}_{ji}'^{T}\mathbf{A}_{ji}'\right)^{-1}\mathbf{A}_{ji}'^{T}(\mathbf{t}_{ij} - \mathbf{t}_{ji}), \qquad (2)$$

where

$$\mathbf{A}'_{ji} = \left[\mathbf{1}_{2k} \ \mathbf{e}\right] \in \mathbb{R}^{2k \times 2}, \qquad \boldsymbol{\theta}'_j = \left[\beta_j \ \tau_{ij}\right]^T \in \mathbb{R}^{2 \times 1}.$$

Since $\mathbf{1}_{2k}^{T}\mathbf{e} = 0$ and $\mathbf{1}_{2k}^{T}\mathbf{1}_{2k} = \mathbf{e}^{T}\mathbf{e} = 2k$, we have $\mathbf{A}_{ji}'^{T}\mathbf{A}_{ji}' = 2k\,\mathbf{I}_2$. Working out Equation (2) therefore reduces the least squares problem to

$$\hat{\boldsymbol{\theta}}'_{j} = \frac{1}{2k}\begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ -1 & +1 & \cdots & -1 & +1 \end{pmatrix} (\mathbf{t}_{ij} - \mathbf{t}_{ji}). \qquad (3)$$

If the number of timestamp exchanges is $k = 2^m$ for some integer *m*, solving Equation (3) only requires 2k logical shifts, 3k additions and 2k subtractions, which can be implemented in hardware with ease.
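The reduced estimator can be illustrated with a short integer-only Python sketch. It is not the authors' FPGA implementation: it assumes the model of Equation (1) with $\alpha_j = 1$, so that $\beta_j$ is the mean of the differences $\mathbf{t}_{ij} - \mathbf{t}_{ji}$ and $\tau_{ij}$ is the mean of the same differences weighted by $\mathbf{e}$. The function name, the argument `log2_k`, and the choice of a single shift per accumulator (rather than one shift per sample) are assumptions made for illustration.

```python
def estimate_offset_and_delay(t_ij, t_ji, log2_k):
    """Estimate (beta_j, tau_ij) from 2k integer timestamp pairs.

    t_ij, t_ji : sequences of 2k integer timestamps (e.g. in TDC bins)
    log2_k     : integer m such that k = 2**m two-way exchanges were made
    Division by 2k is done with a right shift, as hardware would do it,
    so the results are floored to whole timestamp units.
    """
    n = 2 << log2_k  # n = 2k samples
    if len(t_ij) != n or len(t_ji) != n:
        raise ValueError("expected 2k timestamps in each vector")
    sum_all = 0  # accumulates 1^T (t_ij - t_ji)
    sum_alt = 0  # accumulates e^T (t_ij - t_ji), e = [-1, +1, -1, ..., +1]
    for idx in range(n):
        diff = t_ij[idx] - t_ji[idx]
        sum_all += diff
        if idx % 2 == 0:
            sum_alt -= diff
        else:
            sum_alt += diff
    shift = log2_k + 1  # divide by 2k = 2**(log2_k + 1)
    return sum_all >> shift, sum_alt >> shift
```

Because $k$ is a power of two, no divider is needed, which is the property that makes the estimator attractive for an FPGA.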

## B. High resolution timestamping

The resolution of the phase estimation algorithm depends on the timestamping precision. To enable high precision, a single delay line FPGA time-to-digital converter (TDC) was used, along the lines of what is presented in [9]. After implementation, the resolution of the TDC was measured to be 18.4 ps. The range of the TDC is extended by using a coarse counter.
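How a coarse counter extends the range of a fine delay line measurement can be sketched as follows. The 18.4 ps bin width is taken from the text. Everything else is an assumption made for illustration: the clock period, the function name, and the convention that the delay line measures the interval from the event to the next clock edge, at which the coarse counter is sampled. The actual SPADnet design may differ on all of these.

```python
TDC_LSB_PS = 18.4         # measured TDC resolution, from the text
CLOCK_PERIOD_PS = 5000.0  # hypothetical coarse clock period (200 MHz)


def timestamp_ps(coarse_count, fine_code,
                 clock_period_ps=CLOCK_PERIOD_PS, lsb_ps=TDC_LSB_PS):
    """Combine a coarse counter value and a fine TDC code into picoseconds.

    Assumed convention: the event happened fine_code bins before the
    clock edge on which coarse_count was sampled.
    """
    if coarse_count < 0 or fine_code < 0:
        raise ValueError("counter and TDC code must be non-negative")
    return coarse_count * clock_period_ps - fine_code * lsb_ps
```

With this convention the fine code only needs to cover one clock period, while the coarse counter width sets the total measurable range.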

# IV. EXPERIMENTAL EVALUATION

The proposed synchronization scheme has been tested in an 8 node system, using actual FPGA boards of SPADnet modules. The estimation stability of the phase estimator depends on the number of message exchanges. A stability of 157 ps (standard deviation) was reached for k = 4096 two-way message exchanges between two adjacent nodes (Figure 3). The stability can be improved by further increasing the number of exchanges. The smallest phase offset that can be estimated was measured to be 18.4 ps, matching the time resolution of the TDC (Figure 4).
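As a rough guide to how the stability scales with the number of exchanges, suppose (as a simplifying assumption, not a measured property of the system) that each of the 2k timestamp differences carries independent zero-mean noise of variance $\sigma^2$. Since the offset estimate of Equation (3) is an average over these 2k differences,

$$\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{2k}, \qquad \sigma_{\hat{\beta}_j} = \frac{\sigma}{\sqrt{2k}},$$

so quadrupling k would halve the standard deviation of the estimate. Correlated error sources, such as TDC non-linearity, do not average out in this way, so a measured curve may flatten earlier than this idealized model predicts.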

The complete 8 node test setup, shown in Figure 5, also allowed us to verify the clock synchronization estimation stability for all 8 nodes. No increase with respect to the results for adjacent nodes was measured, except for some estimation stability changes due to the non-linearity of the TDC. The latter could be further improved in order to increase the stability of the phase estimator. It should be noted that the synchronization is performed in real time on the same FPGA that takes care of data pre-processing and communication to neighboring nodes. Further, the network itself was successfully tested at a data communication rate of 2 Gb/s with a bit error rate < 1%.

Fig. 3: Estimation stability while increasing message exchanges.

Fig. 4: Transfer function of the phase estimator applied to two adjacent nodes.

# V. CONCLUSION

We presented a solution to scalable photonic module synchronization when a network-based approach to PET is used. This approach was designed for SPADnet photonic modules and is suitable for small preclinical and large human PET, both single- and multi-ring for full-body PET scanners. It is based on a hybrid solution, where a hard-wired clock synchronization network is combined with a network-based clock offset estimator. The synchronization scheme was tested in an 8 node system, using actual FPGA boards of SPADnet modules, achieving synchronization down to 160 ps. This performance can be improved by increasing the number of message exchanges and correcting the FPGA TDC non-linearities. We also regularly monitor the phase offsets, which allows compensation for temperature changes, aging, and other influences.

Fig. 5: 8 node SPADnet test setup. For this setup an estimation stability of 157 ps was measured in the worst case.

# ACKNOWLEDGMENT

The authors are grateful to Xilinx Inc. for FPGA donation and to Raj Thilak Rajan and Sundeep Prabhakar Chepuri for their valuable time in providing feedback and comments.

# REFERENCES

- [1] Y. Haemisch, T. Frach, C. Degenhardt, and A. Thon. Fully digital arrays of silicon photomultipliers (dSiPM) - a scalable alternative to vacuum photomultiplier tubes (PMT). *NSS/MIC IEEE*, pages 2383–2386, Oct 2009.
- [2] S. Mandai and E. Charbon. Multi-channel digital SiPMs: Concept, analysis and implementation. *NSS/MIC IEEE*, pages 1840–1844, Oct 2012.
- [3] L. H. C. Braga, L. Gasparini, L. Grant, R. K. Henderson, N. Massari, M. Perenzoni, D. Stoppa, and R. Walker. A fully digital 8x16 SiPM array for PET applications with per-pixel TDCs and real-time energy output. *JSSC*, pages 301–314, Jan 2014.
- [4] E. Charbon, C. Bruschini, C. Veerappan, L. H. C. Braga, et al. SPADnet: A fully digital, networked approach to MRI compatible PET systems based on deep-submicron CMOS technology. *NSS/MIC IEEE*, 2013.
- [5] C. Veerappan, C. Bruschini, and E. Charbon. Sensor network architecture for a fully digital and scalable SPAD based PET system. *NSS/MIC IEEE*, 2012.
- [6] C. Veerappan, C. Bruschini, and E. Charbon. Distributed coincidence detection for multi-ring based PET systems. *IEEE Real time conference*, 2014.
- [7] R. T. Rajan and A-J. van der Veen. Joint ranging and clock synchronization for a wireless network. *Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP)*, 2011.
- [8] D. W. Allan. Time and frequency (time-domain) characterization, estimation and prediction of precision clocks and oscillators. *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, 34(6):647–654, 1987.
- [9] H. Homulle, F. Regazzoni, and E. Charbon. 200 MS/s ADC implemented in a FPGA employing TDCs. *Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays*, March 2015.