# Extending Vertical FET for Advanced Logic Scaling with Architecture Innovations: Dual-sided Global Signal, Flip VFET and Omega Nanosheet

Ziqiao Xu\*, Yanbang Chu\*, Yimeng Wang, Siyuan Liu, Zhihao Wang, Xinyue He, Yu Liu, Haoran Lu, Yandong Ge, Xun Jiang, Yongqin Wu, Lijie Zhang, Weihai Bu, Yibo Lin, Runsheng Wang, Ming Li, Heng Wu†, Ru Huang \*Contributed Equally, School of Integrated Circuits, Peking University, Beijing 100871, China, †email: hengwu@pku.edu.cn

Abstract: For the first time, vertical FET (VFET), as the most important device candidate for energy-efficient computing (EEC), is comprehensively studied in terms of interconnect routability, device scalability and current drivability. Based on the previously reported dual-sided VFET (DSVFET) with better device symmetry and area efficiency, dual-sided global signal (DGS) routing, which flips certain nets to the wafer backside (BS), was successfully developed beyond frontside (FS) global signal routing (FGS). With DGS, the studied RISC-V core consumes ~15% less power at iso-frequency. Based on the novel DGS concept, the scalability of DSVFET was also explored. By scaling down the vertical Nanosheet (NS) pitch, ~14% lower energy-delay product (EDP) at iso-frequency and ~24% smaller area were confirmed at block level. For ultimate scaling, a novel Flip VFET (FVFET) with self-aligned back-to-back VFET stacking was proposed and evaluated, delivering ~33% extra std. cell area reduction. Furthermore, to overcome the current drivability concerns, the Omega NS channel for DSVFET was also proposed and experimentally demonstrated for the first time, providing 37.5% more effective gate width (Weff) and ~6% frequency gain without any area penalty. This work provides new solutions to the key issues of the VFET, paving its way for future EEC applications.

#### I. INTRODUCTION

As the lateral scaling of transistors approaches its physical limit, the co-optimization of device structure and interconnect architecture becomes increasingly important for further scaling and requires comprehensive study. Meanwhile, compared with other emerging advanced 3D device structures such as Lateral GAA [1], Complementary FET (CFET) [2] and Flip FET (FFET) [3], the VFET features a GAA nanosheet channel with vertically stacked S/D/G and zero diffusion break (ZDB), enabling low leakage, smaller device footprint [4-5] and uniquely low intrinsic parasitics [6]. The DSVFET, an optimized variant of the conventional VFET, was recently proposed and verified [7]. It features superior device performance, further reduced area from the symmetric Source/Drain placed on each side of the wafer, and better routability thanks to its BS signal tracks. This makes it a promising candidate for future EEC applications requiring high density and energy efficiency.

However, the VFET technology still faces several critical challenges: (i) the dual-sided global signal has yet to be implemented to fully utilize the routing space on both sides of the wafer; (ii) its scaling path, in terms of both process and design, has not been fully explored for continued density improvement; and (iii) new solutions are needed to overcome the limited current drivability caused by the low Weff-to-footprint ratio of the vertical nanosheet. These open issues obscure the technology path of the VFET and call for a systematic study.

Therefore, in this work, we first developed a new DGS routing methodology based on net flipping in the DSVFET P&R flow, which effectively lowers the optimization cost of the DGS with clear Power-Performance-Area (PPA) benefits over the FGS. Meanwhile, beyond advanced BEOL interconnects, the FEOL scalability of DSVFET was also fully investigated by scaling the vertical NS pitch and introducing the novel back-to-back-stacked device structure (FVFET). Lastly, to enhance the device's drivability, a new Omega-shaped vertical channel structure was proposed and experimentally demonstrated, with no area penalty but clear improvement in the $W_{\rm eff}$ to footprint efficiency.

#### II. DUAL-SIDED GLOBAL SIGNAL IMPLEMENTATION

#### A. DGS Enablement: Std. Cell Pin Assignment on Both Sides

Unlike conventional lateral-transport transistors (GAA, CFET or FFET) with dual-sided interconnects, the DSVFET's drain electrode can be accessed from only one side of the wafer, as shown in Fig. 1. Consequently, a std. cell output pin can exist on only one side, and all input pins on a net must be placed on the same side as the driver output pin. This constraint requires grouping nets into FS and BS groups to enable DGS routing [8]. To achieve this grouping, we developed a net flipping methodology that moves part of the FS signal nets to the BS. The methodology relies on flexible assignment of cell pin locations: each pin of a std. cell must be freely flippable [9-10] between the FS and BS in DSVFET cells (DS PIN cells) (Fig. 2). Fortunately, all std. cells in the DSVFET library satisfy this requirement owing to the abundant dual-sided intra-cell signal tracks [7].
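The same-side constraint can be made concrete with a small sketch. The data model below (nets as a driver pin plus sink pins, pins named `cell/pin`, and the function `assign_pin_sides`) is hypothetical and is not taken from the DSVFET P&R flow; it only illustrates that choosing a side for a net fixes the side of every pin on that net.

```python
from typing import Dict, List, Tuple

# Hypothetical data model: a net is (driver output pin, list of sink input pins).
Net = Tuple[str, List[str]]


def assign_pin_sides(nets: Dict[str, Net], net_side: Dict[str, str]) -> Dict[str, str]:
    """Derive the wafer side ("FS" or "BS") of every pin from the side of its net.

    All pins of a net share the side of the driver output pin, which is the
    constraint described in the text.
    """
    pin_side: Dict[str, str] = {}
    for name, (driver, sinks) in nets.items():
        side = net_side[name]
        if side not in ("FS", "BS"):
            raise ValueError(f"net {name}: unknown side {side!r}")
        for pin in [driver, *sinks]:
            # A pin belongs to one net only, so it can receive one side only.
            if pin in pin_side and pin_side[pin] != side:
                raise ValueError(f"pin {pin} assigned to both sides")
            pin_side[pin] = side
    return pin_side


# Illustrative netlist fragment (instance and pin names are made up).
nets = {
    "n1": ("u1/ZN", ["u2/A1", "u3/A"]),
    "n2": ("u2/ZN", ["u3/B"]),
}
pin_side = assign_pin_sides(nets, {"n1": "BS", "n2": "FS"})
```

Here cell `u2` ends up with its input `A1` on the BS and its output `ZN` on the FS, which is only legal if the cell's pins can be flipped independently, as the DS PIN cells allow.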

#### B. Performance Evaluation of Cells with DS Pins

Flipping std. cell pins from the FS to the BS is key to DGS routing, and it also changes the std. cell parasitics [11]. A 15-stage ring oscillator (RO) with BEOL load [12-14] was used to evaluate circuit performance. Flipping the INVD1 output pin from the FS to the BS (Fig. 3) reduces the input pin capacitance by 7% and the output pin capacitance by 37% (Fig. 4), resulting in 10% higher RO frequency at iso-power and 33% lower power at iso-frequency.
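For reference, the textbook first-order relation between RO frequency and stage delay (not a formula stated in this work) is, for an $N$-stage RO with average stage delay $t_d$:

$$
f_{\mathrm{RO}} = \frac{1}{2 N t_d}, \qquad N = 15 \;\Rightarrow\; f_{\mathrm{RO}} = \frac{1}{30\, t_d}
$$

Under this relation, a 10% higher RO frequency corresponds to an average stage delay that is shorter by a factor of $1/1.10 \approx 0.91$, i.e. roughly 9%.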

DSVFET std. cell libraries with DS pins were characterized for further cell performance assessment. The AOI21D1, a typical complex std. cell, was chosen to demonstrate the impact of pin flipping on key cell power and speed metrics. With 3 input pins and 1 output pin, each of which can sit on either side, the AOI21D1 has 16 pin placement configurations (Fig. 5). Fig. 6 shows key metrics extracted from the library for these configurations. The metrics differ markedly between configurations, highlighting the importance of careful std. cell pin placement selection for DGS routing.
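The count of 16 follows from two possible sides for each of four pins. A minimal sketch of the enumeration is given below; the pin names `A1`, `A2`, `B` and `ZN` are assumed for illustration and are not taken from the DSVFET library.

```python
from itertools import product

PINS = ("A1", "A2", "B", "ZN")  # assumed AOI21D1 pin names: 3 inputs + 1 output
SIDES = ("FS", "BS")

# One dict per pin placement configuration, e.g. {"A1": "FS", ..., "ZN": "BS"}.
configs = [dict(zip(PINS, sides)) for sides in product(SIDES, repeat=len(PINS))]
num_configs = len(configs)
```

Each entry of `configs` corresponds to one library variant that would need its own characterization, which is why the search space grows quickly with pin count.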

#### C. Net Flipping Methodology

For practical use, the distinct power-performance (PP) resulting from different pin placement configurations calls for an efficient algorithm to narrow down the large search space of flipping possibilities for the DGS. As a solution, a novel parasitics-aware net flipping method was developed and implemented in the DSVFET chip physical design workflow (Fig. 7), based on a RISC-V core. After standard clock tree synthesis, net flipping begins with the evaluation of net capacitances, which estimates the change in net parasitics after flipping. The net flipping ratio, defined as the ratio of BS nets to all nets, is then determined. Subsequently, a dynamic programming (DP) algorithm is executed to search for the optimal grouping of FS and BS nets. This DP algorithm considers the PP of each cell and the total pin capacitance of each net. Next, the net flipping step flips nets in the netlist according to the grouping result. Thanks to the flexible assignment of cell pin locations in DSVFET described previously, the nets can be flipped freely without being blocked by other pins on the target side. Finally, FS nets and BS nets are routed independently to realize the optimized dual-sided global signal network (Fig. 8).
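The grouping step can be illustrated with a knapsack-style DP. This is only a sketch under simplifying assumptions and not the algorithm used in this work: each net is given a single hypothetical "gain" (estimated capacitance reduction if flipped to the BS), flips are treated as independent, and the number of flipped nets is constrained to a ratio window. The function name and parameters are made up for illustration.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def select_nets_to_flip(
    gains: List[float], min_pct: int = 40, max_pct: int = 60
) -> Tuple[float, List[int]]:
    """Pick nets to flip so the total gain is maximal and the flipping
    ratio (flipped nets / all nets) lies within [min_pct, max_pct] percent."""
    n = len(gains)
    lo = -(-min_pct * n // 100)  # ceil(min_pct * n / 100) in integer arithmetic
    hi = max_pct * n // 100      # floor(max_pct * n / 100)
    if lo > hi:
        raise ValueError("no net count satisfies the flipping-ratio window")

    # best[i][k]: best total gain over the first i nets with exactly k flipped.
    best = [[NEG_INF] * (hi + 1) for _ in range(n + 1)]
    took = [[False] * (hi + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        g = gains[i - 1]
        for k in range(min(i, hi) + 1):
            keep = best[i - 1][k]                              # net i-1 stays on FS
            flip = best[i - 1][k - 1] + g if k > 0 else NEG_INF  # net i-1 goes to BS
            if flip > keep:
                best[i][k], took[i][k] = flip, True
            else:
                best[i][k] = keep

    # Choose the best flipped-net count inside the allowed window, then backtrack.
    k = max(range(lo, hi + 1), key=lambda c: best[n][c])
    total = best[n][k]
    flipped: List[int] = []
    for i in range(n, 0, -1):
        if took[i][k]:
            flipped.append(i - 1)
            k -= 1
    return total, sorted(flipped)


# Hypothetical per-net gains in arbitrary units (not data from this work).
example_gains = [3.0, -1.0, 2.0, 0.5, -0.5, 1.5, -2.0, 0.25, -0.25, -0.75]
total_gain, bs_nets = select_nets_to_flip(example_gains)
```

With fully independent gains, sorting the nets by gain would give the same answer; the table form is shown because it extends more naturally to per-cell costs that depend on how many of a cell's pins are flipped.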

The net flipping ratio should be carefully optimized for the DGS because of its strong impact on routability. Fig. 9(a) gives the PP results of RISC-V cores with different net flipping ratios. The core's power drops sharply as the ratio increases and reaches its optimum at 50%, as verified in Fig. 9(b). Considering the symmetric nature of DSVFET and its abundant interconnect resources on both sides of the wafer, the default range of the net flipping ratio for DGS routing is set to 40% to 60% in this study. The proposed DP algorithm is further benchmarked against random net flipping results (Fig. 10), and its result lies close to the best points of the whole dataset, confirming its effectiveness. However, further algorithm optimization is still required. Currently, each net flip is treated as an independent action that does not affect the others. In practice the flips are coupled, because the placements of all fan-in and fan-out nets of a cell jointly determine the cell's parasitics. The remaining optimization space is evidenced by the random net flipping points that outperform the DP point in Fig. 10.

#### D. Block Level PPA Evaluation

Based on the DP algorithm and the optimal flipping ratio range, a RISC-V core was further studied on DSVFET at $V_{DD}$ = 0.35 V for EEC. The newly implemented DGS method shows clear benefits over the normal FGS, with 15% less power at iso-frequency and 16% higher frequency at iso-power (Fig. 11(a)). To explain this gain, the energy consumption breakdown of the two (Fig. 11(b)) shows that the dynamic energy of DGS is reduced by 13% due to lower cell and net parasitics (Fig. 11(c)), indicating that net flipping is a feasible way to realize the DGS with clear PP gains. Most importantly, this net flipping methodology relies only on dual-sided pin accessibility, regardless of the device structure, which makes it readily extendable to other architectures.

#### III. AGGRESSIVE SCALING FOR VFET

#### A. Pitch Scaling on DSVFET

DGS routing greatly relaxes routing congestion and can therefore enable further VFET cell size reduction. Accordingly, the scalability of DSVFET was investigated at a scaled vertical NS pitch. Considering gate-stack filling issues at small NS spacing, a min. NS space of 12 nm (NS pitch = 18 nm) was chosen, whose feasibility has been confirmed in lateral-GAA [1] (Fig. 12(a)). The vertical NS orientation (xNS or yNS in Fig. 1) was also considered for its effects on layout design and parasitics. The scaling on DSVFET further reduces the INVD1 area by 25% (Fig. 12(c)), while the pin capacitance of the xNS INVD1 is smaller than that of the yNS INVD1, consistent with the better PP of xNS in the RO study (Fig. 13). Thus, the xNS design was selected for block level evaluation of the scaled DSVFET.

As for the key metrics of EEC applications, the scaled DSVFET achieves 14% lower power and 14% lower EDP than the baseline at iso-frequency on the block level (Fig. 14). Moreover, it also provides 26%/24% smaller core area at iso-EDP/utilization, respectively, enabling further device density improvement (Fig. 15).
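The matching power and EDP figures are what one would expect at iso-frequency. As a consistency check, assume EDP is taken as energy per clock cycle times the cycle time $T = 1/f$ (a common definition that is not spelled out in the text):

$$
\mathrm{EDP} = E \cdot T = (P\,T)\cdot T = \frac{P}{f^{2}}
\quad\Rightarrow\quad
\left.\frac{\mathrm{EDP}_{\mathrm{scaled}}}{\mathrm{EDP}_{\mathrm{base}}}\right|_{\text{iso-}f}
= \frac{P_{\mathrm{scaled}}}{P_{\mathrm{base}}} \approx 0.86
$$

With $f$ held fixed, the EDP ratio reduces to the power ratio, so a 14% power reduction translates directly into a 14% EDP reduction.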

#### B. VFET Stacking by FVFET

To break the limit of 1-tier planar-integrated VFET, a novel 2-tier VFET with back-to-back stacking on both sides of the wafer, namely the FVFET, is proposed for the first time as the ultimate form of VFET scaling (Fig. 1). As shown in Fig. 16, the FVFET process starts with the self-aligned vertical NS formation for the FS and BS VFETs, followed by the standard VFET FEOL and BEOL processes on the frontside. Then, a carrier wafer is bonded to the frontside of the active device wafer. After wafer flipping, the BS vertical NS is revealed by stripping the substrate and selectively recessing the STI. Finally, the BS VFET's FEOL, the vias connecting the FS and BS devices, and the BS VFET's BEOL are formed.

With 2 signal tracks and 1 shared power rail on each side in the xNS design [3], FVFET can achieve very aggressive scaling with a 2.5T cell height (Fig. 17(a)). However, when forming the transmission gate (TG), the large separation between the top epi of the FS and BS VFETs makes intra-cell routing very challenging. As a solution, a novel connection strategy was proposed, in which the dual-sided output pin [3] of the preceding cell is used to connect the S/D of the FS/BS VFETs in the TG [3,7] (Fig. 17(b)). This structure greatly reduces the area needed to connect the top epi of the FS/BS VFETs, enabling std. cell area reductions of ~14% and ~33% for FVFET-yNS and FVFET-xNS, respectively (Fig. 17(c)). The RO simulation with typical BEOL loads further validates that the cell area reduction can effectively compensate for the worse parasitics within the cell, proving the potential of FVFET-xNS for ultimate scaling while maintaining performance (Fig. 18).

#### IV. OMEGA NANOSHEET FOR DRIVABILITY

Transistor scaling also requires boosting the drivability. Normal NS DSVFET architectures face limitations in increasing the W<sub>eff</sub> without area overhead. To solve this issue, for the first time, a novel structure called Omega NS (Fig. 1) was applied to the DSVFET. The Omega NS DSVFET process is fully compatible with the normal DSVFET process, differing only in the NS formation loop (Fig. 19), in which ring-shaped NSs are formed first and then etched in the middle to form the Omega NSs.

This concept was successfully demonstrated by experiments, which show the initial ring-shaped NSs after mandrel strip and Si NS etch (Figs. 20(a-b)), the Omega NSs after the NS cut (Fig. 20(c)), and enlarged views of the Omega NSs (Figs. 20(d-e)). This verifies that the Omega NS can be introduced into the DSVFET process flow with relatively low effort, requiring only a minor modification in the active cut. Note that, like the normal vertical NS, the Omega NS can also increase Weff by expanding the NS footprint, as shown in Figs. 20(d-e) for different lengths, validating the Omega NS as a compatible add-on performance booster for VFET. Meanwhile, its enhanced drivability from larger Weff was confirmed at device level (Fig. 21). Despite the capacitance increase from larger Weff, the RO simulation confirms that the current improvement dominates over the capacitance increase (Fig. 22), showing a 6% frequency gain at iso-power. The Omega NS could greatly extend the applications of DSVFET beyond the EEC.
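The trade-off between current and capacitance can be sketched with the first-order delay model $t_d \propto C_{\rm eff} V_{DD} / I_{\rm eff}$. This is a generic approximation, not the RO simulation used in this work: it ignores BEOL load and the iso-power normalization, and the function name and input values below are placeholders rather than measured data.

```python
def ro_frequency_ratio(ieff_gain: float, ceff_gain: float) -> float:
    """Frequency ratio (new / baseline) under the first-order model
    t_d ~ Ceff * Vdd / Ieff at fixed Vdd, so that f ~ Ieff / Ceff.

    Gains are fractional changes, e.g. 0.20 means +20%.
    """
    if ieff_gain <= -1.0 or ceff_gain <= -1.0:
        raise ValueError("gains must be greater than -100%")
    return (1.0 + ieff_gain) / (1.0 + ceff_gain)


# Placeholder inputs for illustration only; they are not values from this work.
ratio = ro_frequency_ratio(ieff_gain=0.20, ceff_gain=0.10)
```

In this model the frequency improves only when the fractional gain in effective current exceeds the fractional increase in effective capacitance, which is the condition the RO comparison is meant to check.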

#### V. CONCLUSION

This work comprehensively investigated the key aspects of the VFET as a future logic device candidate, including routability, scalability and drivability. For the first time, a novel and extendable net flipping methodology for DGS was proposed with clear PPA benefits. Together with the DGS routing, architecture innovations of pitch scaling, FVFET, and the Omega NS collectively advance the VFET roadmap for next-generation logic technology beyond EEC.

#### ACKNOWLEDGMENT

This work was supported in part by the National Key R&D Program of China under Grant 2023YFB4402200; in part by the 1+1 Project under Grant QYJS-2023-2301-B; in part by the NSFC under Grant 92464206; and in part by Grant JWZQ20240101004.

#### REFERENCES

[1] N. Loubet et al., VLSI 2017.
[2] C. Zhang et al., IEDM 2024.
[3] H. Lu et al., VLSI 2024.
[4] A. Veloso et al., IEDM 2019.
[5] H. Jagannathan et al., IEDM 2021.
[6] G. Tsutsui et al., IEDM 2022.
[7] Y. Liu et al., VLSI 2025.
[8] S. Choi et al., TED 2025.
[9] J. Ahn et al., DAC 2025.
[10] H. Lu et al., DATE 2025.
[11] J. Lee et al., VLSI 2025.
[12] W. Peng et al., VLSI 2025.
[13] A. Farokhnejad et al., IITC 2022.
[14] H. Wu et al., EDTM 2025.



Fig. 1. Evolution roadmap of the VFET including the enhancement of drivability, ultimate device scaling and advanced interconnects



Fig. 2. Schematic of the net flipping methodology. DS PIN cells enable the FS net flipping to the BS net.



Fig. 3. Cross sections of FS and DS PIN INVD1 cells in cross gate cut.



Fig. 4. The power-performance and pin capacitances (inset) between ROs with DS PIN INVD1 and FS PIN INVD1.

Fig. 5. The 16 types of pin placement configurations of DSVFET AOI21 (right) and layout of the 2 highlighted ones (left).

Fig. 6. Comparison of cell delay, transition time, internal power and pin capacitance of 16 AOI21 designs as labeled in Fig. 5.



Fig. 7. Workflow of dual-sided global signal routing based on parasitic-aware net flipping methodology.

Fig. 8. Physical designs of RISC-V core with FGS and DGS.

Fig. 9. (a) Power-performance across various net flipping ratios. (b) Normalized power consumption vs. net flipping ratio.



Fig. 10. The PP results by DP net flipping compared to the distribution (as the background colormap) from 500 PP results by random net flipping.



Fig. 11. Block-level benchmark of Dual-sided Global Signal (DGS) by net flipping method vs. the Frontside Global Signal (FGS). (a) The power-performance. (b) The breakdown of energy consumption. (c) The comparison of total net cap after the physical implementation.



Fig. 21. Comparison of device characteristics between Omega NS and Normal NS DSVFETs: (a) Id-Vg and Cgg-Vg curves. (b) Ieff of NMOS and PMOS. (c) Ceff of NMOS and PMOS.

Fig. 22. Comparison of RO PP for Omega NS and normal NS DSVFETs.