# **CMOS Integrated Nanophotonics – Enabling Technology for Exascale Computing Systems**

Solomon Assefa, William M. J. Green, Alexander Rylyakov, Clint Schow, Folkert Horst\* and Yurii A. Vlasov IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA; \*IBM Zurich GMBH, Rueschlikon, Switzerland

**Abstract:** CMOS Integrated Nanophotonics allows ultra-dense monolithic single-chip integration of optical and electrical functions. This technology can enable future Exaflops supercomputers by connecting racks, modules, and chips together with ultra-low power massively parallel optical interconnects.

© 2010 Optical Society of America

OCIS codes: (200.4650) Optical interconnects; (130.0250) Optoelectronics; (250.5300) Photonic integrated circuits

## 1. Introduction

High-performance computing (HPC) systems capable of delivering Exaflops (10<sup>18</sup> floating point operations per second) performance are envisioned to become a reality by the end of this decade. Scaling computing systems to Exaflops will require tremendous increases in communications bandwidth, but with greatly reduced power consumption per communicated bit as compared to today's Petaflops machines. To enable such HPC systems, hundreds of millions of optical interconnects will be required at all system levels connecting racks, modules and chips together. CMOS Integrated Nanophotonics technology allows monolithic integration of deeply scaled optical circuits into the front-end of a standard CMOS process, thus opening a way towards massively parallel Terabit/sec-class optical transceivers on a single CMOS die, drastically decreasing the cost and power consumption.

## 2. CMOS front-end integration of nanophotonic components

Here we report the proof-of-principle demonstration of dense integration of electronic and nanophotonic components performed at the IBM Research pilot CMOS line using 200 mm SOI wafers (SOITEC) having a 220 nm silicon device layer on top of a 2 $\mu$m BOX.



**Figure 1**. **A).** Schematic diagram of the processing flow in the front-end of the CMOS line. Several nanophotonics processing modules are added at corresponding steps to incorporate modulators, detectors, and fiber couplers into the CMOS circuitry. **B).** Microphotograph of part of a CMOS die with tightly integrated CMOS and nanophotonics circuitry comprising break-out CMOS digital (RO) and analog (TIA/LA, DRV) circuits as well as a full 6-channel receiver (6-ch RX) and 6-channel transmitter (6-ch TX).

Several processing modules have been added to a standard CMOS processing flow at the front-end of the line, as shown in Fig. 1A. These modules require a minimal number of additional unique masks and processing steps, while sharing most mask levels and processing steps with the rest of the CMOS flow. For example, passive waveguides and electro-optical and thermo-optical modulators require only a single additional mask in the flow [1-2], since they share the same silicon device layer with CMOS PFETs and NFETs. To build a high-performance Ge photodetector, we developed a "Ge-first" integration approach that utilizes a rapid melt growth technique concurrent with the source-drain anneal step [3-4]. As opposed to traditional approaches, where Ge is typically grown by CVD after the source-drain anneal step (and hence can be called "Ge-last"), the "Ge-first" approach enables the sharing of many CMOS steps and mask levels, thus minimizing cost, while yielding a very thin (100 nm) defect-free Ge layer. Following these developments, the full CMOS maskset was designed with 130 nm CMOS ground rules, and with critical nanophotonics levels having 65 nm ground rules.

## 2. Design and performance of CMOS circuits after Nanophotonics integration

The CMOS circuits were designed in a standard digital IBM 130 nm bulk CMOS process. To ease the proof-of-principle demonstration, the circuits were designed with only a single copper interconnect layer, using polysilicon as a second conductive layer for intersections when necessary. This design limitation, although adequate for a proof-of-concept demonstration, significantly lowers the speed and increases the power dissipation of the circuits.

### 2.1. Digital CMOS circuits. Ring Oscillator.

The CMOS ring oscillator (RO) testsite is taken from a standard IBM digital library and consists of a 65-stage ring oscillator, a 10-stage ripple counter, and a 4-stage divider (see Fig. 2a). After integration of all nanophotonics processing modules, the RO exhibits a 12 ps delay per stage at saturation with 1.5 V bias. This performance is on-target for 130 nm bulk technology.
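To put the per-stage delay in context, the short sketch below estimates the raw oscillation frequency of the ring. It assumes the textbook relation f = 1 / (2 N t_d) for an N-stage inverter ring; this relation and the function name are illustrative assumptions, since the text reports only the stage count and the per-stage delay, and the effect of the ripple counter and divider on the measured output is not modeled.

```python
# Textbook estimate of ring-oscillator frequency from the per-stage delay.
# Assumption (not from the paper): f = 1 / (2 * N * t_d) for an N-stage ring.

def ring_oscillator_frequency_hz(num_stages: int, stage_delay_s: float) -> float:
    """Return the estimated oscillation frequency of an N-stage ring oscillator."""
    # One full period requires a transition to travel around the ring twice.
    return 1.0 / (2.0 * num_stages * stage_delay_s)

NUM_STAGES = 65          # stage count quoted in the text
STAGE_DELAY_S = 12e-12   # 12 ps per stage at 1.5 V bias, quoted in the text

f_ring_hz = ring_oscillator_frequency_hz(NUM_STAGES, STAGE_DELAY_S)
print(f"Estimated ring frequency: {f_ring_hz / 1e6:.0f} MHz")
```

Under these assumptions the ring itself would run at roughly 640 MHz before any frequency division.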



Figure 2. Schematics, die photos, and performance results of CMOS digital circuits (ring oscillator) (a) and CMOS analog circuits (transimpedance and modulator driver amplifiers, respectively) (b, c) and Nanophotonic circuits (cascaded Mach-Zehnder WDM) (d) after completion of the full integrated process flow.

### 2.2. Analog CMOS circuits. Receiver amplifier.

The receiver amplifier (see Fig. 2b) consists of a DC-coupled common-gate TIA occupying 170 $\mu$m x 40 $\mu$m, followed by a cascade of seven current-mode logic (CML) buffer LA stages, followed by a single-ended open-drain output driver. The total area occupied by the multi-stage LA and output driver is 160 $\mu$m x 50 $\mu$m. The total circuit area is dominated by decoupling capacitors used to minimize the supply voltage noise. After incorporation of all the nanophotonics processing modules, the receiver circuits exhibit an open eye diagram at 5 Gbps with a total power dissipation of about 28 mW (or about 5.6 pJ/bit).

### 2.3. Analog CMOS circuits. Transmitter modulator driver.

The modulator driver circuitry (see Fig. 2c) consists of an input pre-driver (105 $\mu$m x 175 $\mu$m) made of a 6-stage differential current-mode logic amplifier, and an output stage (105 $\mu$m x 70 $\mu$m) consisting of a cascoded differential driver with a dedicated power supply. The total circuit area is dominated by decoupling capacitors used to minimize the supply voltage noise. High-speed operation of the output stage is enabled by a differential pair of low-threshold thin-oxide NFETs. The high-voltage, high-output-swing capability of the output stage is enabled by a pair of long-channel thick-oxide devices in a cascode configuration, which protect the thin-oxide devices from the output voltages. After incorporation of all the nanophotonics processing modules, the transmitter circuits exhibit an open eye diagram at 5 Gbps with a total power dissipation of about 36 mW (or about 7.2 pJ/bit).
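The energy-per-bit figures quoted for the receiver and transmitter circuits follow directly from dividing power dissipation by line rate. A minimal sketch of that arithmetic is given below; the function and variable names are illustrative, and the power and bit-rate values are the approximate ones reported in the text.

```python
# Energy per bit = power dissipation / bit rate, expressed in pJ/bit.

def energy_per_bit_pj(power_w: float, bit_rate_bps: float) -> float:
    """Convert power (W) at a given line rate (bit/s) into energy per bit (pJ)."""
    return power_w / bit_rate_bps * 1e12

BIT_RATE_BPS = 5e9      # 5 Gbps line rate from the text

rx_pj_per_bit = energy_per_bit_pj(28e-3, BIT_RATE_BPS)  # receiver, about 28 mW
tx_pj_per_bit = energy_per_bit_pj(36e-3, BIT_RATE_BPS)  # transmitter, about 36 mW

print(f"Receiver:    {rx_pj_per_bit:.1f} pJ/bit")
print(f"Transmitter: {tx_pj_per_bit:.1f} pJ/bit")
```

The results reproduce the values of about 5.6 pJ/bit and 7.2 pJ/bit stated for the receiver and transmitter, respectively.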

## 3. Design and performance of Nanophotonic components

Nanophotonic channel-type waveguides (a.k.a. "photonic wires") with a 200 nm x 500 nm cross-section exhibit propagation loss below 3 dB/cm and bending losses of less than 0.02 dB per 90° bend with a 6 $\mu$m radius. A cascaded Mach-Zehnder (CMZ) 8-channel WDM filter was designed using these photonic wires, following the approach described in [5]. Three consecutive CMZ stages (see Fig. 2d) provide 400 GHz (3.2 nm) channel spacing with a 2 nm flat-top passband and less than -10 dB cross-talk. The use of high-confinement photonic wires and $\mu$m-size bends reduces the total area of the device to just 360 $\mu$m x 170 $\mu$m. After completion of the CMOS process flow, the device (see Fig. 2d) exhibits 8 flat-top 2 nm-wide passbands with in-band ripple of less than 1 dB and cross-talk of less than -10 dB.
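The correspondence between the 400 GHz and 3.2 nm channel spacings can be checked with the relation Δλ ≈ λ² Δf / c. The sketch below assumes a center wavelength of 1550 nm, which is not stated in the text, and assumes a binary-tree arrangement in which each CMZ stage doubles the number of separated channels; both are illustrative assumptions rather than details taken from the design.

```python
# Convert a WDM channel spacing from frequency to wavelength units.
# Assumptions (not from the paper): 1550 nm center wavelength, and a
# binary-tree demultiplexer in which each stage splits the channels in two.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def channel_spacing_nm(center_wavelength_nm: float, spacing_ghz: float) -> float:
    """Return the wavelength spacing (nm) for a given frequency spacing (GHz)."""
    wavelength_m = center_wavelength_nm * 1e-9
    spacing_m = wavelength_m ** 2 * (spacing_ghz * 1e9) / SPEED_OF_LIGHT_M_PER_S
    return spacing_m * 1e9

def channels_from_stages(num_stages: int) -> int:
    """Number of output channels for a binary tree with the given stage count."""
    return 2 ** num_stages

spacing_nm = channel_spacing_nm(1550.0, 400.0)
num_channels = channels_from_stages(3)

print(f"Channel spacing: {spacing_nm:.2f} nm")
print(f"Channels after 3 stages: {num_channels}")
```

With these assumptions the spacing comes out close to the quoted 3.2 nm, and three stages yield the 8 channels of the filter.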

A library of various designs of electro-optical modulators based on carrier injection [2] or carrier depletion in a PIN diode configuration has been included in the mask. Single-ended, differential, and push-pull configurations with various doping profiles for the PIN diode were explored. Various optical designs based on broadband Mach-Zehnder interferometers (MZI), on resonantly enhanced ring-assisted MZI, and on stand-alone ring resonators in all-pass and add-drop configurations were also fabricated. The small-signal response of devices in reverse bias exhibits an almost flat RF response with a 3 dB cutoff exceeding 20 GHz.

An analogous library of Ge photodetectors based on an evanescently coupled metal-semiconductor-metal longitudinal configuration of electrodes has also been tested [3-4]. The small-signal response showed a 3 dB cutoff beyond 40 GHz at 1.5-2.0 V bias, with some devices producing up to 10 dB avalanche gain at the same low bias conditions.

## 4. 6-channel Nanophotonic transceiver

Figure 3 shows die photographs of a 6-channel transmitter (Fig. 3a) and a 6-channel receiver (Fig. 3b).



Figure 3. Micro-photographs of a die with a 6-channel transmitter (a) and a 6-channel receiver (b).

The footprint of the transmitter (Fig. 3a), counted as shown in Fig. 1B, is 3700 $\mu$m x 350 $\mu$m. The area per channel of 0.21 mm<sup>2</sup> is limited by the pad frame and decoupling capacitors. The area of the 6-channel receiver (Fig. 3b), counted the same way, is 3700 $\mu$m x 500 $\mu$m, which gives an area per channel of 0.31 mm<sup>2</sup>. Correspondingly, with the current limitations, the area per single transceiver channel can be estimated as 0.52 mm<sup>2</sup>, which is approximately 10 times smaller than in previous demonstrations [6].
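The per-channel areas follow from dividing each footprint by the six channels it contains. The sketch below reproduces that arithmetic from the dimensions given in the text; the function and variable names are illustrative.

```python
# Area per channel = footprint area / number of channels, in mm^2.

def area_per_channel_mm2(length_um: float, width_um: float, num_channels: int) -> float:
    """Return the area per channel (mm^2) for a rectangular footprint in micrometers."""
    area_mm2 = length_um * width_um * 1e-6  # 1 um^2 = 1e-6 mm^2
    return area_mm2 / num_channels

NUM_CHANNELS = 6

tx_area_mm2 = area_per_channel_mm2(3700.0, 350.0, NUM_CHANNELS)  # transmitter
rx_area_mm2 = area_per_channel_mm2(3700.0, 500.0, NUM_CHANNELS)  # receiver
trx_area_mm2 = tx_area_mm2 + rx_area_mm2                         # one TX + one RX channel

print(f"Transmitter: {tx_area_mm2:.3f} mm^2 per channel")
print(f"Receiver:    {rx_area_mm2:.3f} mm^2 per channel")
print(f"Transceiver: {trx_area_mm2:.3f} mm^2 per channel")
```

The unrounded values are close to 0.216, 0.308, and 0.524 mm<sup>2</sup>, consistent with the quoted figures of 0.21, 0.31, and 0.52 mm<sup>2</sup>.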

## Conclusions

A proof-of-principle demonstration of dense monolithic integration of nanophotonic components with analog and digital CMOS circuitry has been performed. The integration density offered by the technology is at least 10 times higher than in previous reports. The technology offers a solution for massively parallel Terabit/sec-class optical transceivers, with 50 channels supporting a 20 Gbps line rate each, that will occupy only 4 mm x 4 mm on a single CMOS die.
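The projected transceiver figures can be combined into an aggregate bandwidth and an implied area budget per channel. The sketch below is simple arithmetic on the numbers quoted above; the bandwidth-density figure is a derived quantity that is not stated in the text, and the variable names are illustrative.

```python
# Aggregate bandwidth and area budget for the projected 50-channel transceiver.

NUM_CHANNELS = 50
LINE_RATE_GBPS = 20.0
DIE_SIDE_MM = 4.0

aggregate_gbps = NUM_CHANNELS * LINE_RATE_GBPS          # total throughput in Gbps
die_area_mm2 = DIE_SIDE_MM * DIE_SIDE_MM                # 4 mm x 4 mm footprint
area_per_channel_mm2 = die_area_mm2 / NUM_CHANNELS      # area budget per channel
bandwidth_density = aggregate_gbps / die_area_mm2       # Gbps per mm^2 (derived)

print(f"Aggregate bandwidth: {aggregate_gbps / 1000:.1f} Tbps")
print(f"Area per channel:    {area_per_channel_mm2:.2f} mm^2")
print(f"Bandwidth density:   {bandwidth_density:.1f} Gbps/mm^2")
```

The projection thus corresponds to 1 Tbps in aggregate, with an area budget of 0.32 mm<sup>2</sup> per channel.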

## References

- 1. J. Van Campenhout et al., Optics Letters 35(7), 1013-1015 (2010).
- 2. J. Van Campenhout et al., Optics Express 17(26), 24020-24029 (2009).
- 3. S. Assefa et al., IEEE Journal of Selected Topics in Quantum Electronics 16(5), 1376-1385 (2010).
- 4. S. Assefa et al., Optics Express 18(5), 4986-4999 (2010).
- 5. C. G. H. Roeloffzen et al., IEEE Photonics Technology Letters 12, 1201-1203 (2000).
- 6. B. Analui et al., IEEE Journal of Solid-State Circuits 41, 2945 (2006).

Acknowledgments. Contributions of many of our colleagues across various organizations at IBM Research are gratefully acknowledged, especially Fengnian Xia, Leathen Shi, Jeffrey Sleight, Young-Hee Kim, Chris Jahnes, and the staff at the MRL.