Introduction

Motivation

Printed Circuit Board (PCB) reverse engineering1,2,3,4,5,6,7,8, the process of analyzing and reconstructing the design of an existing PCB, serves several purposes, including managing part obsolescence, recovering intellectual property, supporting compliance and quality assurance, and enhancing national capabilities.

Part obsolescence

When original design files are lost, or components become obsolete, reverse engineering helps in recreating the PCB layout and generating a netlist, which is a detailed list of the electronic components and their interconnections, to support continued production and maintenance. The backlog of parts, devices, and machines due to obsolete PCBs is a significant issue, particularly in highly regulated industries like medical devices and automotive manufacturing9. In the medical device industry, component obsolescence can cause serious delays and financial losses. For example, the sudden discontinuation of essential components can halt production and require extensive regulatory approval for replacements, leading to months of downtime for devices waiting on critical parts10. In the automotive sector, supply chain disruptions have similarly led to considerable backlogs. Jaguar Land Rover (JLR) reported a backlog affecting up to 10,000 cars at its peak due to parts shortages, including those related to obsolete PCBs11. While improvements have been made, the backlog still numbers in the thousands. The PCB market, which supports a wide range of industries from consumer electronics to defense, faces constant challenges due to the rapid pace of technological advancements and obsolescence. According to Market Data Forecast, the market size is projected to grow from USD 76.12 billion in 2024 to USD 93.87 billion by 2029, indicating the scale at which obsolescence and supply chain issues could impact production across various sectors12.

Intellectual property (IP) recovery

In some cases, reverse engineering helps in recovering intellectual property13 when the original design data is no longer available, ensuring that the design can be reused or modified. This differs from obsolescence management in that parts or devices may not be obsolete, but the IP is still lost.

Compliance and quality assurance

Ensuring that a product complies with industry standards and regulations sometimes requires reverse engineering to verify the design and implementation. Additionally, the “build” must sometimes be compared against the “designed” specifications through a verification and validation process to ensure the fabrication process is reliable and that the parts are indeed the ones that were ordered14.

Enhancing national capabilities

To improve national capabilities, the designs of other offshore parties may need to be reverse engineered to be understood and potentially replicated or enhanced15.

Review of current methods and their challenges

Reverse engineering methods for PCBs can be broadly categorized into two types based on their approach: non-geometry-based methods that rely on functional testing, and geometry-based methods that utilize visual and imaging techniques.

Category 1: non-geometry-based methods through electrical testing

These methods focus on assessing the functionality of the PCB without extracting the physical layout of traces and junctions:

  • Continuity testing: This method uses a multimeter to test the continuity of traces and connections. By verifying the electrical pathways, continuity testing helps in creating a netlist, which is a representation of the electrical connectivity of the PCB16,17. Drawbacks and limitations: Continuity testing is limited to simple PCBs and can be time-consuming for complex boards. It does not provide detailed physical layout information, making it insufficient for comprehensive reverse engineering.

  • In-circuit testing (ICT): Specialized equipment is used to test the functionality of the PCB while it remains assembled. ICT provides insights into the circuit’s operation and helps identify hidden connections, ensuring that the PCB performs as intended18. Drawbacks and limitations: ICT may not detect all faults, particularly those related to intermittent issues or subtle component defects.
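As an illustration of how pairwise continuity readings translate into a netlist, the following minimal sketch groups pads into nets with a union-find structure. The pad names and readings are hypothetical and are not taken from any board discussed in this paper.

```python
from collections import defaultdict

def build_netlist(pads, continuity_pairs):
    """Group pads into nets from pairwise continuity readings (union-find)."""
    parent = {p: p for p in pads}

    def find(p):
        # Follow parent links to the representative, halving the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in continuity_pairs:
        # A beep between a and b means both pads belong to the same net.
        parent[find(a)] = find(b)

    nets = defaultdict(list)
    for p in pads:
        nets[find(p)].append(p)
    # Sort members and nets so the result is deterministic.
    return sorted(sorted(members) for members in nets.values())

# Hypothetical pads and multimeter readings
pads = ["U1.1", "U1.2", "R1.1", "R1.2", "C1.1"]
pairs = [("U1.1", "R1.1"), ("R1.2", "C1.1")]
netlist = build_netlist(pads, pairs)
```

Each inner list is one net. A pad with no continuity to any other pad forms a net of its own.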

Category 2: geometry-based methods through imaging and visualization

These methods involve extracting the physical layout of the PCB through imaging techniques, followed by an analysis step to reconstruct the PCB’s design1,2,3,4,5,6,7,8.

Step 1: imaging

When inspecting and analyzing PCBs, different techniques must be employed depending on whether the board is single-layer or multi-layer.

  • Single-layer PCBs: For PCBs with only one layer, visual inspection and manual tracing can be employed. Alternatively, a camera or scanner can capture the necessary details19. Drawbacks and limitations: Manual tracing is labor-intensive and prone to human error. Conventional automated imaging approaches can miss fine details.

  • Multi-layer PCBs: For PCBs with multiple layers, imaging buried layers is crucial. Two main approaches are used: destructive methods and non-destructive methods.

Destructive methods involve consecutive delayering and imaging of the PCB. Methods for delayering include:

  • Chemical stripping: Using chemicals to remove layers20. Drawbacks and limitations: Chemical stripping can damage sensitive components and requires careful handling of hazardous substances.

  • Mechanical stripping: Grinding or milling away layers20,21. Drawbacks and limitations: Mechanical stripping can introduce physical distortions and inaccuracies.

  • Focused ion beam (FIB): Precision removal using ion beams22. Drawbacks and limitations: FIB is limited to small areas, making it impractical for large-scale analysis.

  • Laser ablation: Using lasers to remove layers7,8. Drawbacks and limitations: Laser ablation requires fine-tuning of the recipe parameters, which may in turn need significant experimentation.

Imaging techniques for these methods include optical microscopy7,23, confocal microscopy8, and scanning electron microscopy (SEM).

Non-destructive methods: X-ray Computed Tomography (X-ray CT) is commonly used for non-destructive imaging, which is particularly useful when only a single instance of the board exists and needs to remain functional after analysis1,2,5,6. Drawbacks and limitations: X-ray CT may require fine-tuning of the imaging parameters, which can necessitate optimization efforts. Additionally, the image quality and level of detail that can be extracted can be affected by artifacts such as beam hardening.

Step 2: analysis

Once images are acquired, they must be analyzed to reconstruct the PCB’s design. Methods include:

  • Manual analysis: The traditional method involving human inspection and interpretation of images. Drawbacks and limitations: It is labor-intensive, time-consuming, and prone to human error.

  • Conventional image processing: Utilizing algorithms to process and analyze images. Drawbacks and limitations: This method requires fine-tuning for specific cases, limiting its universality1.

  • Machine learning: Employing machine learning algorithms for image analysis and image semantic segmentation, which involves classifying each pixel in an image into a predefined category. Drawbacks and limitations: The effectiveness of this approach depends on the availability of large, annotated datasets, which are expensive to generate5,6.

  • Hybrid approaches: Combining image processing with machine learning can leverage the strengths of both methods. Drawbacks and limitations: Hybrid approaches still face challenges related to data availability and the integration of different methodologies.

An additional key challenge of the existing methods is that they are based on segmenting 2D image slices from a 3D volume, rather than segmenting the 3D volume itself. This approach faces challenges such as aligning the plane of images with the PCB layers and addressing distortions in the PCB shape. By treating images as 3D volumes, as presented in this paper, these issues can be mitigated.

Proposed solution

PCB image segmentation presents unique challenges due to the complex geometry and physical distortions of PCBs. Traditional methods segment 2D slices from a 3D reconstructed volume and then stack them, which introduces significant challenges when the image slices are not parallel to PCB layers—a common occurrence in image acquisition and reconstruction. This misalignment often requires substantial manual correction. Additionally, bent PCB layers cannot be effectively captured in 2D slices, leading to further inaccuracies. To address these limitations, we propose the first method for direct 3D semantic segmentation, which eliminates the need for slice alignment and ensures robustness against bending or distortions in PCB layers. Furthermore, deep learning-based segmentation methods typically require large, labeled datasets, which are costly and time-intensive to produce.

We propose a novel strategy for automated semantic segmentation of PCB X-ray CT images and extraction of netlist information by combining image processing and machine learning algorithms and using synthetic data for training.

Innovations of the proposed technique

Key innovations of the proposed solution include:

  1. 3D volume semantic segmentation: Unlike traditional slice-by-slice methods, our approach segments directly in 3D, preserving spatial context and effectively handling bent PCBs, which are challenging for 2D methods.

  2. Synthetic data generation: We generate labeled synthetic data for training, eliminating the need for costly and time-consuming real-world data acquisition and annotation.

  3. Combined image processing and machine learning: Our method integrates image processing for initial segmentation with machine learning for semantic segmentation, leveraging the strengths of both techniques.

  4. Fully automated procedure: The process is entirely automated, reducing manual effort and ensuring robust, scalable PCB design reconstruction.

Note that, unlike other domains, such as medical imaging, PCB segmentation presents unique challenges, including complex multi-layered structures, thin high-contrast features (e.g., traces and vias), bending or misalignment during imaging, and the critical need to preserve precise connectivity information for netlist extraction. To the best of our knowledge, the proposed method is the first automated method using a 3D U-Net architecture specifically tailored for the reverse engineering of PCBs, addressing these unique challenges through a combination of synthetic data, preprocessing steps, and domain-specific validation.

Process

The process of the proposed solution is as follows:

  1. Image acquisition: The process begins with acquiring high-resolution 3D images of PCBs using a CT scan system. The raw 2D projection data from the CT scan is reconstructed into 3D volumetric images, providing detailed views of the PCB’s internal structure.

  2. Pre-processing: Pre-processing techniques, including noise reduction and contrast enhancement, are applied to improve image quality.

  3. Copper content isolation: To isolate the metal content, a watershed segmentation algorithm is used. This step ensures accurate identification of traces and junctions within the PCB.

  4. 3D semantic segmentation: For semantic segmentation, we employ a 3D U-Net architecture with a pretrained ResNet-18 backbone. This network is trained using synthetic data, which simulates the variability and complexity of real-world PCBs.

  5. Model optimization: The machine learning model is optimized using the Adam optimizer, with a combined dice and focal loss function to handle class imbalance and improve segmentation accuracy.

  6. Validation: Validation is conducted on both synthetic and real-world datasets, achieving high performance metrics.

  7. Post-processing: In the post-processing phase, the segmentation results are refined through overlapping dissection to ensure accurate boundary predictions, followed by voxel assignment and binary mask conversion. Additional operations, such as binary closing and small object removal, further enhance the segmentation quality.

  8. Netlist extraction: Finally, netlists are extracted from the semantically segmented images through automated identification of connectivity between junctions. Junctions and nets are assigned unique identifiers, and the connectivity is analyzed to construct a pseudo-netlist, which is validated through comparison with known designs.

By offering a less expensive, less time-consuming, and more universal method for PCB design reconstruction, the proposed technique mitigates issues related to part obsolescence, intellectual property recovery, and compliance. Additionally, this technique has broader applications in various physical and biological sciences where 3D image segmentation is crucial.

Demonstration and validation examples

In this study, we demonstrate our methodology using a real commercial 3-layer PCB. This PCB serves as a practical example throughout the Methodology section, illustrating each step of our reverse engineering process. For the Validation and Case Studies section, we will additionally use a fully manufactured 2-layer PCB of our own design to evaluate the effectiveness and accuracy of our methods. This approach allows us to verify our results against known ground truth data and assess the robustness of our techniques in practical applications.

Methodology

Overview

Our process of automating the reverse engineering of PCBs includes acquiring high-resolution 3D images of PCBs, segmenting the metal content, performing semantic segmentation of the junctions within the metal content, and post-processing the results to extract netlists. This process is schematically demonstrated in Fig. 1. Throughout this section, we utilize a real commercial 3-layer PCB to demonstrate each step of our methodology, providing a practical and detailed example of our approach.

Fig. 1. The process flow for the proposed PCB design reconstruction solution.

Imaging and data acquisition

X-ray CT

The first step in our methodology involves acquiring high-resolution 3D images of PCBs using an X-ray CT system. We used a Zeiss Xradia 520 Versa X-ray system in this work. There are inherent challenges in using this technique for PCBs with large length-to-width ratios: X-rays may fail to penetrate fully along the length direction, or the short side may be overexposed. To mitigate these issues, we adjust the number of exposures at different rotation angles of the sample. This balances the trade-off between penetration and exposure and yields usable volumetric data for subsequent analysis. However, the focus of this work is not on image acquisition or on optimizing imaging parameters. The primary goal of this study is to develop and demonstrate an automated methodology for the semantic segmentation of PCB images. A key strength of our proposed method is its robustness and generalizability, as it is designed to work effectively regardless of the imaging modality used. By validating our approach with CT images, which are often more challenging due to artifacts and limitations, we underscore the versatility of the method. Furthermore, we anticipate that the proposed methodology would perform even better with alternative imaging techniques, such as computed laminography (CL), which is better suited for scanning large PCBs. This flexibility ensures that our approach can accommodate a wide range of imaging conditions and technologies.

X-ray imaging parameters

The X-ray CT imaging machine is configured with optimal settings to balance resolution, scan time, and field of view. Parameters such as voltage, current, exposure time, voxel size, filter type, rotational speed, number of projections, and reconstruction algorithm are adjusted to ensure high-quality images while capturing the entire region of interest in the obtained images.

Image reconstruction

The raw 2D projection data from the X-ray CT scan are reconstructed into 3D volumetric images using Reconstructor Scout-and-Scan software (version 14.0) provided by Zeiss. The resulting images provide a detailed view of the PCB’s internal structure, including metal traces, vias, and junctions within the glass fiber material of the PCB.

Figure 2 shows sample 2D slices from the 3D reconstruction of the PCB.

Fig. 2. 2D slices of X-ray CT scan images from a commercial PCB. The pixel size is 14.12 μm. The image size is 1074 × 1074 pixels. The field of view is approximately 15.16 mm × 15.16 mm.

Metal content segmentation

Watershed segmentation

To isolate the metal content from the rest of the PCB materials, we apply a watershed segmentation algorithm24. This method is chosen for its robustness in handling varying intensities and noise in the CT images.

Pre-processing

The 3D CT images are pre-processed to enhance their quality. This involves applying Gaussian smoothing25 to reduce noise and improve the clarity of the metal regions, facilitating more accurate segmentation.
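A minimal sketch of this smoothing step is given below, assuming SciPy's `gaussian_filter` as the implementation. The value of `sigma` and the random test volume are illustrative assumptions, not the settings used in this work.

```python
import numpy as np
from scipy import ndimage as ndi

def smooth_volume(volume, sigma=1.0):
    """Apply an isotropic 3D Gaussian filter to a CT volume.

    sigma (in voxels) is an assumed value and would need tuning per scan.
    """
    return ndi.gaussian_filter(volume.astype(np.float32), sigma=sigma)

# Illustrative noisy volume standing in for a reconstructed CT scan
rng = np.random.default_rng(0)
noisy = 0.5 + 0.1 * rng.standard_normal((16, 32, 32))
smoothed = smooth_volume(noisy, sigma=1.0)
```

Larger values of `sigma` suppress more noise but also blur thin traces, so the choice is a trade-off.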

Seed generation

Seeds for the watershed algorithm are placed at the lower and upper percentiles of intensity values corresponding to specific glass fiber and metal content. The lower percentile seeds identify starting points within the glass fiber regions, while the upper percentile seeds ensure the inclusion of metal areas.
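The seed placement can be sketched as follows. The percentile values (50 and 95), the marker labels, and the toy volume are assumptions chosen for illustration; the percentiles used in this work are not stated here.

```python
import numpy as np

def make_seeds(volume, low_pct=50.0, high_pct=95.0):
    """Build a marker volume: 1 = glass fiber seed, 2 = metal seed, 0 = undecided.

    low_pct and high_pct are hypothetical percentiles, not the paper's values.
    """
    lo, hi = np.percentile(volume, [low_pct, high_pct])
    markers = np.zeros(volume.shape, dtype=np.int32)
    markers[volume <= lo] = 1   # darkest voxels seed the glass fiber region
    markers[volume >= hi] = 2   # brightest voxels seed the metal region
    return markers

# Toy volume: dim background with one bright block standing in for copper
rng = np.random.default_rng(0)
toy = 0.2 + 0.01 * rng.standard_normal((8, 16, 16))
truth = np.zeros(toy.shape, dtype=bool)
truth[3:5, 4:12, 4:12] = True
toy[truth] += 0.6
markers = make_seeds(toy)
```

Voxels left at 0 are the ones the watershed flooding assigns to one of the two seeded regions.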

Watershed algorithm

The watershed algorithm26,27,28 treats the image as a topographic surface in which the intensity values represent height. The seeds serve as markers, i.e., the initial points for region growth. The algorithm simulates water flooding from these seeds, filling up catchment basins and delineating boundaries where different regions meet. This flooding continues until the entire image is segmented, effectively isolating the metal regions. This step produces a binary mask of the metal regions, allowing for precise segmentation despite the varying intensities and noise present in the CT images.

Figure 3 shows the extracted metal content from the CT images of the commercial PCB presented in Fig. 2.

Fig. 3. 3D segmented metal content from the commercial PCB (created with Python 3.8 code).

Junction semantic segmentation

Segmentation using neural networks

We employ a 3D U-Net architecture29,30 with a pretrained ResNet-1831 backbone for the semantic segmentation of junctions in the extracted metal content. The U-Net architecture is chosen for its effectiveness in image segmentation tasks, while the ResNet-18 backbone, pretrained on ImageNet32, enhances feature extraction.

U-Net structure

The U-Net consists of an encoder-decoder structure with skip connections. The encoder progressively down-samples the input, extracting hierarchical features, while the decoder up-samples the features to the original resolution. The 3D U-Net extends this architecture into the third dimension, employing 3D convolutions and 3D max-pooling layers to capture volumetric spatial context.

ResNet-18 backbone

The ResNet-18 backbone is integrated into the encoder to leverage its deep residual learning capabilities. This addition improves the model’s ability to capture intricate details in the input data.

Transfer learning

The pretrained 3D U-Net leverages transfer learning by initializing its weights from a 2D network trained on the large ImageNet dataset. This initialization helps in capturing low-level features effectively, which are then fine-tuned for the specific 3D segmentation task. The transfer learning approach reduces training time and improves model accuracy by utilizing the pretrained weights, allowing the network to adapt to the nuances of 3D data with a robust starting point.

Synthetic data generation

Creating a diverse and comprehensive dataset is essential for training our deep learning model for semantic segmentation of junctions from the 3D metal content. For this purpose, we leveraged our previously developed approach of creating synthetic datasets for semantic segmentation of PCB 2D images6 and adapted it for the generation of a 3D synthetic dataset. The synthetic dataset needs to simulate the variability and complexity of real-world PCBs.

2D synthetic image creation

To generate synthetic 3D PCB images, we first create 2D synthetic images with corresponding junction masks representing single-layer PCBs and the locations of the junctions. The steps involved are as follows.

Canvas setup

A blank canvas of size 1536 × 1536 pixels is initialized. This size is chosen to provide ample space for complex trace patterns while allowing for cropping to the desired final size of 1024 × 1024 (Note that 1536 = 256 + 1024 + 256).

Junction placement

A random number of junctions (10 ≤ n ≤ 20) are placed on the canvas. The junctions are represented as disks with random diameters, ensuring variability in the junction sizes.

Trace generation

A random number (m) of traces are drawn to connect pairs of junctions. The number m is chosen to be up to 25% of the possible connections (n×(n + 1)/8). This ratio ensures a balanced number of connections without overwhelming the canvas. The traces are either regular traces (with 90% probability) or serpentine traces (with 10% probability), a ratio chosen to reflect the typical prevalence of straight connections in PCB designs while still including a reasonable number of serpentine traces to account for design variations. The shape-defining parameters for these traces are chosen randomly, as described in definitions below:

  • Regular traces: These are the linear paths of random width used to connect various components on a PCB. These often need to change direction to connect different components or navigate around obstacles. Therefore, each regular trace is divided into segments, with a random number of bends or breaks (0–4). This variability allows the traces to adapt to the layout’s complexity and mimic real-world PCB designs.

  • Serpentine traces: These traces are designed with intentional loops or meanders to match the length of other traces or introduce delays. Serpentine traces are generated with a random number of peaks (1–10) and peak values (1–100 pixels) with random width. This design choice reflects the need for precise timing adjustments in PCB layouts, where serpentine traces are used to manage signal timing and integrity.
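The junction placement and trace generation described above can be sketched as follows. The junction count, the bound on the number of traces, the 90/10 split, and the bend and peak ranges follow the text; the radius and width ranges, the way bends are placed, and all function names are assumptions made for illustration and may differ from our implementation.

```python
import numpy as np

def draw_segment(img, grid, p, q, width):
    """Set to 1 every pixel within width/2 of segment p-q (p == q gives a disk)."""
    yy, xx = grid
    d = q - p
    denom = float(d @ d)
    if denom == 0.0:
        t = 0.0
    else:
        t = np.clip(((yy - p[0]) * d[0] + (xx - p[1]) * d[1]) / denom, 0.0, 1.0)
    dist2 = (yy - p[0] - t * d[0]) ** 2 + (xx - p[1] - t * d[1]) ** 2
    img[dist2 <= (width / 2.0) ** 2] = 1

def regular_path(rng, a, b, size):
    """Polyline from a to b with 0-4 bends at random waypoints (assumed scheme)."""
    bends = int(rng.integers(0, 5))
    waypoints = rng.uniform(0, size - 1, size=(bends, 2))
    return np.vstack([a, waypoints, b])

def serpentine_path(rng, a, b):
    """Zigzag from a to b with 1-10 peaks of 1-100 pixel height."""
    peaks = int(rng.integers(1, 11))
    height = rng.uniform(1, 100)
    direction = b - a
    length = max(float(np.hypot(direction[0], direction[1])), 1e-9)
    normal = np.array([-direction[1], direction[0]]) / length
    ts = np.linspace(0.0, 1.0, 2 * peaks + 1)
    pts = a + ts[:, None] * direction
    pts[1::2] += height * normal          # every odd point is a peak
    return pts

def make_layer(rng, size=1536):
    """Return (image, junction_mask) for one synthetic single-layer PCB."""
    grid = np.mgrid[:size, :size]
    image = np.zeros((size, size), dtype=np.uint8)
    mask = np.zeros_like(image)
    n = int(rng.integers(10, 21))                      # 10 <= n <= 20
    centers = rng.uniform(0, size - 1, size=(n, 2))
    radii = rng.uniform(0.01, 0.03, size=n) * size     # assumed radius range
    m = int(rng.integers(1, n * (n + 1) // 8 + 1))     # up to n(n + 1)/8 traces
    for _ in range(m):
        i, j = rng.choice(n, size=2, replace=False)
        width = rng.uniform(2.0, 6.0)                  # assumed width range
        if rng.random() < 0.9:
            pts = regular_path(rng, centers[i], centers[j], size)
        else:
            pts = serpentine_path(rng, centers[i], centers[j])
        for p, q in zip(pts[:-1], pts[1:]):
            draw_segment(image, grid, p, q, width)
    for c, r in zip(centers, radii):
        draw_segment(image, grid, c, c, 2 * r)         # junction disk in the image
        draw_segment(mask, grid, c, c, 2 * r)          # same disk in the label mask
    return image, mask

image, mask = make_layer(np.random.default_rng(0), size=128)
```

Because the mask is drawn from the same junction parameters as the image, the label is exact by construction and no manual annotation is needed.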

Figure 4 schematically shows the difference between straight and serpentine traces.

Fig. 4. Difference between straight and serpentine traces is schematically demonstrated.

Cropping

The resulting 1536 × 1536 images and junction masks are cropped to 1024 × 1024 pixels. Cropping yields more realistic inputs for training the deep learning model, because CT images often do not cover the whole PCB and are usually larger than the input size of the trained model.
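A minimal sketch of the crop is shown below, assuming a central crop with a 256-pixel margin on each side, as suggested by 1536 = 256 + 1024 + 256. The function name is hypothetical.

```python
import numpy as np

def center_crop(arr, out_size=1024):
    """Crop the central out_size x out_size window of a 2D array."""
    margin_y = (arr.shape[0] - out_size) // 2
    margin_x = (arr.shape[1] - out_size) // 2
    return arr[margin_y:margin_y + out_size, margin_x:margin_x + out_size]

canvas = np.zeros((1536, 1536), dtype=np.uint8)
canvas[256, 256] = 1          # first pixel that should survive the crop
cropped = center_crop(canvas)
```

The same crop must be applied to the image and to its junction mask so that the labels stay aligned.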

Junction mask

As the PCB images are synthetically generated using known parameters, each image automatically includes its junction mask to be used as a label in the dataset for training the deep learning network. Figure 5 shows examples of synthetic samples generated in this manner (left column) along with their corresponding junction masks (right column). Note that although these synthetic PCB layouts may differ significantly from real-world examples due to the random placement of junctions and creation of traces, they effectively serve the purpose of training the machine learning algorithm for semantically segmenting PCB images. The value of these generated scenarios lies in their ability to cover a wide diversity of possible PCB layouts, thereby enhancing the algorithm’s robustness and generalization capabilities.

Fig. 5. Examples of synthetic 2D PCB images (left) and their junction masks (right).

3D image assembly

The next step involves assembling the 2D images into 3D synthetic PCB images.

Layer placement

The goal of layer placement is to construct a synthetic 3D PCB image by systematically integrating multiple 2D PCB layers into a 3D volumetric space, ensuring realistic layer distribution and thickness variability. The 3D synthetic PCB assembly is performed as follows.

  1. Initialization: We start with a blank 3D volumetric space of dimensions 128 × 1024 × 1024 voxels. This space represents the 3D structure of the PCB, where each slice corresponds to a layer in the z-dimension (depth).

  2. 2D synthetic PCB images: Prior to the 3D construction, we generate multiple 2D synthetic PCB images, each of size 1024 × 1024 pixels. These images represent single layers of the PCB, with randomly placed junctions and traces as described earlier.

  3. Random placement of layers: To simulate the realistic distribution of PCB layers, 2D synthetic PCB images are placed approximately 10 slices apart within the 128-slice volume. This spacing ensures that the layers are not too densely packed, mimicking the actual structure of multilayer PCBs. The exact placement of each 2D layer within the 128 slices is randomized. For instance, a 2D layer could be placed at slices 1, 11, 21, etc., but the exact starting slice is determined randomly within a range to introduce variability. This prevents a uniform pattern and adds to the realism of the synthetic 3D image.

  4. Layer thickness variation: Each 2D PCB layer is assigned a random thickness between 2 and 4 slices. This thickness variation is essential to emulate the non-uniformity observed in real PCBs where different layers may have different thicknesses. If a layer is assigned a thickness of 3 slices, for example, the same 2D synthetic PCB image is repeated over three consecutive slices in the z-dimension. This repetition maintains the continuity of the layer across the assigned thickness.

  5. Filling the volume: The process of placing 2D layers, separated by approximately 10 slices and with random thicknesses, is repeated iteratively until the entire 128-slice volume is filled. This approach ensures that the 3D synthetic PCB image is fully populated with layers distributed throughout the volume.
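The layer placement steps above can be sketched as follows. The spacing of about 10 slices and the thickness of 2 to 4 slices follow the text; the jitter of ±2 slices, the cycling through the supplied layers, and the function name are assumptions made for illustration.

```python
import numpy as np

def assemble_volume(layers, masks, depth=128, spacing=10, jitter=2, rng=None):
    """Stack 2D layers into a 3D volume with random spacing and thickness.

    jitter (in slices) is an assumed range for randomizing the start slice.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = layers[0].shape
    volume = np.zeros((depth, height, width), dtype=np.uint8)
    labels = np.zeros_like(volume)
    z = int(rng.integers(0, jitter + 1))          # random first slice
    k = 0
    while z < depth:
        thickness = int(rng.integers(2, 5))       # 2, 3, or 4 slices
        z_end = min(z + thickness, depth)
        idx = k % len(layers)                     # cycle through supplied layers
        volume[z:z_end] = layers[idx]             # repeat the 2D image over the thickness
        labels[z:z_end] = masks[idx]
        z += spacing + int(rng.integers(-jitter, jitter + 1))
        k += 1
    return volume, labels

# Small stand-in layers; real layers would be 1024 x 1024 synthetic images
demo_layers = [np.ones((8, 8), dtype=np.uint8)]
demo_volume, demo_labels = assemble_volume(
    demo_layers, demo_layers, rng=np.random.default_rng(0)
)
occupied = demo_volume.any(axis=(1, 2))
```

With a step of 8 to 12 slices and a thickness of at most 4 slices, consecutive layers never overlap in this sketch.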

Figure 6 schematically shows the first few steps of the layer placement procedure.

Fig. 6. The first few steps of the layer placement procedure are schematically shown.

It should be noted that the layer placement method described here is solely for the purpose of generating synthetic 3D PCB images and does not apply to the processing or correction of stacking traces in real 3D reconstructed images.

Random 3D rotation

The CT images of multilayer PCBs do not necessarily have layers parallel to the surface plane of the PCB sample, as manual mounting can introduce an angle. Therefore, after creating the 3D PCB of size 128 × 1024 × 1024 voxels, the entire volume is randomly rotated around the x, y, and z axes to simulate different poses that the PCB can take when mounted for imaging in the X-ray CT machine or the different ways that the 2D slices are generated from a 3D volumetric X-ray CT image.
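A minimal sketch of this augmentation is given below, assuming SciPy's `ndimage.rotate` and a hypothetical maximum tilt of ±5 degrees per axis; the angle range used in this work is not stated here. The mask is rotated with nearest-neighbour interpolation so that it stays binary.

```python
import numpy as np
from scipy import ndimage as ndi

def random_rotate(volume, mask, max_angle_deg=5.0, rng=None):
    """Rotate a volume and its mask by the same random angles about all three axes.

    max_angle_deg is an assumed range, not a value taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    volume = volume.astype(np.float32)
    # Each pair of axes spans the plane of rotation: (1, 2) rotates about axis 0, etc.
    for axes in ((1, 2), (0, 2), (0, 1)):
        angle = float(rng.uniform(-max_angle_deg, max_angle_deg))
        volume = ndi.rotate(volume, angle, axes=axes, reshape=False,
                            order=1, mode="constant", cval=0.0)
        mask = ndi.rotate(mask, angle, axes=axes, reshape=False,
                          order=0, mode="constant", cval=0)
    return volume, mask

vol = np.zeros((8, 16, 16), dtype=np.uint8)
vol[3:5, 4:12, 4:12] = 1
rot_vol, rot_mask = random_rotate(vol, vol.copy(), rng=np.random.default_rng(0))
```

Using `reshape=False` keeps the rotated volume at its original shape, so the subsequent dissection is unaffected.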

Figure 7 shows a synthetic 3D PCB sample and its corresponding junction mask, from two views.

Fig. 7. A synthetic 3D PCB sample (top) and its junction mask (bottom), from two views (created with Python 3.8 code).

Dissection for training

Due to GPU memory constraints, the input size of the 3D deep learning model is much smaller than the 128 × 1024 × 1024 voxels of the synthetic images. Therefore, to train the deep learning model, the 3D synthetic image volume and its corresponding mask volume are dissected into smaller sub-volumes.

Dissection parameters

Each 3D image and its corresponding mask are dissected into sub-volumes of size 64 × 128 × 128 voxels with a stride of 32 × 64 × 64 voxels. This overlapping dissection is not necessary during the training phase but is essential during the prediction phase to ensure that boundary regions are properly represented in at least one volume.

Final dataset

We created 250 whole synthetic 3D PCB images and masks. With the dissection and stride sizes described above, each whole 3D image and its corresponding mask are dissected into (1 + (128 − 64)/32) × (1 + (1024 − 128)/64) × (1 + (1024 − 128)/64) = 3 × 15 × 15 = 675 sub-images for training the deep learning network. Therefore, the dataset has a total of 250 × 675 = 168,750 samples.
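The sub-volume count can be checked with a short sketch of the dissection. The window and stride sizes are those given above; the function names and the small test volume are illustrative.

```python
import math
import numpy as np

def n_windows(size, window, stride):
    """Number of window positions along one axis."""
    return 1 + (size - window) // stride

def dissect(volume, window=(64, 128, 128), stride=(32, 64, 64)):
    """Yield (origin, sub_volume) pairs covering the volume."""
    counts = [n_windows(s, w, st) for s, w, st in zip(volume.shape, window, stride)]
    for iz in range(counts[0]):
        for iy in range(counts[1]):
            for ix in range(counts[2]):
                origin = (iz * stride[0], iy * stride[1], ix * stride[2])
                sl = tuple(slice(o, o + w) for o, w in zip(origin, window))
                yield origin, volume[sl]

# Count for one full synthetic volume, computed without allocating it
full_shape, window, stride = (128, 1024, 1024), (64, 128, 128), (32, 64, 64)
per_volume = math.prod(n_windows(s, w, st)
                       for s, w, st in zip(full_shape, window, stride))
total_samples = 250 * per_volume

# Dissect a smaller volume to exercise the generator
small = np.zeros((128, 256, 256), dtype=np.uint8)
pieces = list(dissect(small))
```

The image volume and the mask volume are dissected with the same origins so that each sub-image keeps its matching label.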

Note that while reducing the reliance on large datasets is a common trend in deep learning, sufficient and diverse training data remain critical for properly training a model. Generating synthetic data addresses the challenges of collecting real-world data for PCBs, which is prohibitively expensive due to the labor and machine time required for X-ray CT imaging. Each scan can take several hours, and obtaining thousands of images would significantly impact resources. Moreover, changes in imaging conditions, such as resolution or beam intensity, would require collecting entirely new datasets. Synthetic data generation allows us to overcome these limitations by simulating diverse PCB scenarios, including variations in layer configurations, physical distortions (e.g., bent layers), and noise. This approach ensures scalability, adaptability, and robust training of the segmentation network.

Training procedure

Our 3D U-Net model with pretrained ResNet-18 backbone is trained on the generated synthetic dataset using the following procedure.

Optimizer and learning rate

The Adam optimizer is used with a learning rate of 0.0001. Adam is chosen for its adaptive learning rate properties, which help in achieving faster convergence.

Loss function

A combined dice and focal loss function is employed to handle class imbalance and improve segmentation accuracy33,34,35. The dice loss component addresses class imbalance (metal content versus background) by maximizing the overlap between the predicted segmentation and the ground truth, which is particularly useful for small classes (here, the metal content). The focal loss component mitigates the impact of easy negatives (the background or non-target class) by down-weighting their contribution.
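For illustration, a NumPy sketch of a combined dice and focal loss for a binary target is given below. The equal weighting of the two terms and the values of `gamma` and `alpha` are assumptions, and the sketch is not the training code used in this work.

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    """1 - Dice coefficient between predicted probabilities and a binary target."""
    intersection = np.sum(prob * target)
    return float(1.0 - (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps))

def focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss; well-classified voxels are down-weighted by (1 - p_t)^gamma."""
    prob = np.clip(prob, eps, 1.0 - eps)
    p_t = np.where(target == 1, prob, 1.0 - prob)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

def combined_loss(prob, target, weight=0.5):
    """Weighted sum of dice and focal losses (weight = 0.5 is an assumed choice)."""
    return weight * dice_loss(prob, target) + (1.0 - weight) * focal_loss(prob, target)

# Small foreground block inside a mostly empty volume
target = np.zeros((8, 8, 8))
target[2:6, 2:6, 2:6] = 1
loss_perfect = combined_loss(target.copy(), target)
loss_uncertain = combined_loss(np.full(target.shape, 0.5), target)
loss_inverted = combined_loss(1.0 - target, target)
```

The three example losses increase from the perfect prediction to the uniformly uncertain one to the inverted one, which is the ordering a segmentation loss should produce.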

Training schedule

The model is trained for 25 epochs. Each epoch involves a full pass through the training dataset, adjusting the model parameters to minimize the loss function.

Validation and testing

The dataset is split into training (70%), validation (10%), and test (20%) sets. The best model is selected based on validation loss, and its performance is evaluated on the test set. At a threshold of 0.5, the model achieved a test Intersection over Union (IoU)36 of 0.9749 and an F1 score37 of 0.9868.
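The two metrics can be computed from a probability map and a binary ground truth as sketched below. The example values are made up for illustration and are unrelated to the reported results.

```python
import numpy as np

def iou_and_f1(prob, target, threshold=0.5):
    """Return (IoU, F1) of the thresholded prediction against a binary target."""
    pred = prob >= threshold
    target = target.astype(bool)
    tp = int(np.sum(pred & target))     # true positives
    fp = int(np.sum(pred & ~target))    # false positives
    fn = int(np.sum(~pred & target))    # false negatives
    if tp + fp + fn == 0:
        return 1.0, 1.0                 # both empty: treat as perfect agreement
    return tp / (tp + fp + fn), 2 * tp / (2 * tp + fp + fn)

# Made-up example: two hits, one false alarm, one miss
example_target = np.array([1, 1, 1, 0, 0, 0])
example_prob = np.array([0.9, 0.8, 0.4, 0.6, 0.1, 0.2])
iou, f1 = iou_and_f1(example_prob, example_target)
```

Here the counts are 2 true positives, 1 false positive, and 1 false negative, giving an IoU of 2/4 and an F1 score of 4/6.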

Prediction and post-processing

Prediction

During the prediction phase, the input 3D images are dissected into sub-volumes for processing by the trained model.

Overlapping dissection

To ensure accurate predictions at the boundaries, the input image is dissected into overlapping sub-volumes of size 64 × 128 × 128 voxels with a stride of 16 × 32 × 32 voxels, i.e., one quarter of the sub-volume size along each axis, so that adjacent sub-volumes share three quarters of their extent. This overlapping strategy mitigates boundary artifacts (Fig. 8).

Fig. 8. Schematic representation of dissections (64 × 128 × 128) with a stride of (16 × 32 × 32), i.e., one quarter of the sub-volume size along each axis.

Voxel assignment

For voxels in the overlapping regions, the mean of the predictions from all overlapping sub-volumes is assigned. This averaging approach improves the accuracy of predictions for voxels at the boundaries of the sub-volumes.
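The overlapping prediction and averaging can be sketched as follows. `model_fn` is a hypothetical stand-in for the trained network, and the sketch assumes that the volume size minus the window size is a multiple of the stride along each axis, so that every voxel is covered.

```python
import numpy as np

def predict_with_overlap(volume, model_fn, window=(64, 128, 128), stride=(16, 32, 32)):
    """Average per-voxel predictions from all overlapping sub-volumes.

    model_fn maps a sub-volume to a probability map of the same shape.
    """
    prob_sum = np.zeros(volume.shape, dtype=np.float32)
    counts = np.zeros(volume.shape, dtype=np.float32)
    starts = [range(0, s - w + 1, st) for s, w, st in zip(volume.shape, window, stride)]
    for z in starts[0]:
        for y in starts[1]:
            for x in starts[2]:
                sl = (slice(z, z + window[0]),
                      slice(y, y + window[1]),
                      slice(x, x + window[2]))
                prob_sum[sl] += model_fn(volume[sl])   # accumulate predictions
                counts[sl] += 1                        # and how often each voxel was seen
    return prob_sum / np.maximum(counts, 1)            # mean over overlapping windows

# Identity "model" on a small volume: the average must reproduce the input
demo = np.random.default_rng(0).random((8, 16, 16)).astype(np.float32)
averaged = predict_with_overlap(demo, lambda sub: sub, window=(4, 8, 8), stride=(1, 2, 2))
```

Voxels near the centre of the volume are covered by many windows, while corner voxels are covered by only one, which is why the per-voxel count is tracked explicitly.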

Binary mask conversion

The output of the model is a Softmax probability map38, indicating the confidence that each voxel belongs to a metal trace or junction.

Thresholding

A predefined threshold of 0.35 is applied to convert the probability map into a binary mask. Voxels with probabilities above the threshold are classified as metal traces or junctions, while others are classified as background.
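A minimal sketch of this conversion is given below. It assumes a two-channel network output with the background in channel 0 and the target class in channel 1; this channel layout and the function names are assumptions of the example.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable Softmax along the channel axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def to_binary_mask(foreground_prob, threshold=0.35):
    """Voxels with probability strictly above the threshold are foreground."""
    return np.asarray(foreground_prob) > threshold

# Example usage on raw network outputs (channel 1 assumed to be the target class):
# mask = to_binary_mask(softmax(logits, axis=0)[1])
```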

Post-processing

The initial binary mask undergoes post-processing operations to refine the segmentation:

  • Binary closing: A box structuring element is used to perform binary closing, which involves dilation followed by erosion. This process fills small holes and connects nearby components in binary images. The size of the structuring element is chosen based on the typical size of gaps and noise in the segmented images.

  • Small object removal: Objects smaller than a predefined size threshold are removed from the binary mask. This step eliminates noise and small artifacts that do not correspond to actual metal junctions.
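The two post-processing operations can be sketched with `scipy.ndimage` as follows. The structuring element size `closing_size` and the size threshold `min_voxels` are placeholder values chosen for illustration, not the values used in our experiments.

```python
import numpy as np
from scipy import ndimage

def postprocess(mask, closing_size=3, min_voxels=50):
    """Binary closing with a box element, followed by small object removal."""
    # Box structuring element (a cube in 3D); the size is an illustrative assumption.
    structure = np.ones((closing_size,) * mask.ndim, dtype=bool)
    closed = ndimage.binary_closing(mask, structure=structure)
    # Label connected regions and count the voxels in each region.
    labels, _ = ndimage.label(closed)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_voxels
    keep[0] = False  # label 0 is the background
    return keep[labels]
```

Note that, with the default border handling of `ndimage.binary_closing`, objects touching the image border may be eroded slightly; padding the mask before closing is one way to avoid this.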

Figure 9 shows the predicted and post-processed junction mask of the commercial PCB of Fig. 2, from two different views.

Fig. 9

Predicted and post-processed junction mask of the commercial PCB, from two different views (created with Python 3.8 code).

Pseudo-netlist extraction

With the segmented 3D images identifying the junctions, we proceed to extract a pseudo-netlist that describes the connectivity between these junctions on the PCB. By assigning each junction to the pins of the components on the PCB, the complete netlist can be created.

Junction identifiers

Each identified junction in the predicted junction mask is assigned a unique name, such as J1, J2, etc. This is done by detecting connected regions within the extracted junction mask. In our current implementation, junctions that are close and connected are segmented as a single junction and thus receive a single identifier. Manual or automatic inspection of the complete board with attached materials can then be used to identify and assign the correct pins of the components to these junctions.

Net identifiers

Separate connected regions within the metal content are labeled using connected component analysis. Each component is assigned a unique net identifier, namely Net1, Net2, etc. Each segmented junction belongs to exactly one net.

Junction-net assignment

To assign junctions to their respective nets, each junction in the junction mask is linked to a specific net by determining which net’s voxels are present within the junction. Because each net is a connected region consisting of junctions and traces, the net of a junction can be found by examining the intersection of the junction with all nets. Each junction must be entirely contained within a single net component, which ensures an unambiguous assignment.

Pseudo-netlist generation

The connectivity between components or junctions of the PCB is determined by analyzing the net assigned to the labeled junctions. Each net is a group of junctions interconnected by traces. The pseudo-netlist is then constructed by listing each net and its associated junctions.
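The labeling, junction-net assignment, and pseudo-netlist generation steps can be sketched as below. The function name, the use of `scipy.ndimage.label` with its default connectivity, and the choice to ignore junction voxels that fall on the background are assumptions of this example rather than a description of our exact implementation.

```python
import numpy as np
from scipy import ndimage

def extract_pseudo_netlist(metal_mask, junction_mask):
    """Return a dict mapping net identifiers to lists of junction identifiers."""
    # Connected component analysis of the metal content (nets) and of the junctions.
    net_labels, n_nets = ndimage.label(metal_mask)
    junction_labels, n_junctions = ndimage.label(junction_mask)
    netlist = {f"Net{i}": [] for i in range(1, n_nets + 1)}
    for j in range(1, n_junctions + 1):
        # Nets whose voxels are present within this junction.
        nets_hit = np.unique(net_labels[junction_labels == j])
        nets_hit = nets_hit[nets_hit > 0]  # ignore background voxels (assumption)
        if len(nets_hit) != 1:
            raise ValueError(f"Junction J{j} intersects {len(nets_hit)} nets")
        netlist[f"Net{int(nets_hit[0])}"].append(f"J{j}")
    return netlist
```

The same function applies to 2D and 3D masks, since the labeling operates on arrays of any dimension. Nets without any junction remain in the result with an empty list.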

Figure 10 shows the detected nets of the partially imaged commercial PCB in different colors as well as the detected junctions with their identifiers.

Fig. 10

Colored nets and named junctions of the partially imaged commercial PCB (created with Python 3.8 code).

Validation and case studies

Three case studies are presented in this paper to validate the proposed methodology. The first case study, which has been used throughout the paper, is summarized in this section. We further validate our approach using a custom-designed, fully manufactured 2-layer PCB, thoroughly assessing the accuracy and reliability of our design reconstruction process. Additionally, we address the challenge of handling physically distorted PCBs by introducing a bent version of our 2-layer PCB. To address these distortion scenarios, we train a new deep learning network, demonstrating the robustness of our approach in dealing with real-world imperfections and deformations.

Commercial PCB

The first case study, involving a commercial PCB used throughout the Methodology section to illustrate various aspects of the proposed approach, is summarized in Fig. 11.

Fig. 11

Summary of case study 1: commercial PCB (Segmentation visualizations were created with Python 3.8 code).

Custom 2-layer PCB

A custom 2-layer PCB was designed and imaged to further validate our approach. The known circuit design of this PCB provided a reliable reference for evaluating our reverse engineering method. The 2-layer PCB, manufactured with a known netlist, was scanned using a CT scanner. The segmentation and netlist extraction steps were then applied to the scanned images (Figs. 11, 12, 13, 14 and 15).

Fig. 12

The designed 2-layer PCB mounted for imaging.

Fig. 13

2D slices of CT scan images of the designed PCB. The pixel size is 57.81 μm. The image size is 1108 (W) × 1276 (H) pixels. The field of view is approximately 64.09 mm × 73.75 mm.

Fig. 14

3D segmented metal content (left) and segmented junctions (right) of the designed PCB (created with Python 3.8 code).

As presented in Table 1, the pseudo-netlist of the designed PCB was extracted using the proposed method. Comparing the extracted pseudo-netlist with the known design allowed the accuracy of the method to be assessed.

Table 1 The extracted pseudo-netlist of the designed PCB.

Fig. 15

Colored nets and named junctions of the designed PCB, obtained from the proposed method (created with Python 3.8 code).

Bent PCB handling

Physical distortions, such as bending, can complicate the reverse engineering process since PCB layers may not remain within flat planes that correspond to slices from the 3D image. To address this challenge, we created a dataset of bent PCBs and trained a separate 3D U-Net model specifically for this scenario.

Dataset creation

Synthetic 3D images of bent PCBs were generated by applying geometric transformations to the existing synthetic dataset. These transformations included bending, twisting, and warping to simulate real-world distortions. Figure 16 shows two slices of the content and the corresponding junctions of a bent synthetic sample used for training the network. As seen in the images, each slice only partially contains a PCB layer due to the geometric distortions in 3D.
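As a simplified illustration of such a transformation, the sketch below bends a volume by displacing each column along the thickness axis with a smooth parabolic offset. The parabolic profile, the `amplitude` parameter, and the function name are assumptions of this example; twisting and warping are not covered. Applying the same function with nearest-neighbor interpolation (`order=0`) to the content volume and to its junction labels keeps the two aligned.

```python
import numpy as np
from scipy import ndimage

def bend_volume(volume, amplitude=6.0, order=0):
    """Bend a (z, y, x) volume along z with a parabolic profile across x."""
    nz, ny, nx = volume.shape
    z, y, x = np.meshgrid(np.arange(nz), np.arange(ny), np.arange(nx), indexing="ij")
    # Normalized x coordinate in [-1, 1]; the shift is 0 at the edges and
    # equal to `amplitude` at the center (illustrative profile, assumed).
    half = max((nx - 1) / 2.0, 1.0)
    u = (x - (nx - 1) / 2.0) / half
    shift = amplitude * (1.0 - u ** 2)
    # Each output voxel samples the input at (z - shift, y, x), so the content
    # moves by +shift along z. order=0 keeps label values intact.
    coords = np.array([z - shift, y, x], dtype=np.float64)
    return ndimage.map_coordinates(volume, coords, order=order, mode="constant", cval=0)
```

After bending, a layer that originally occupied a single slice is spread over several slices, so each slice contains only part of the layer.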

Fig. 16

Two slices of the bent synthetic content (left) and the corresponding junction labels (right).

Training and evaluation

The new model was trained using the same parameters as the original model but with the bent PCB dataset. To evaluate the performance of the new model, we assessed our method with a bent PCB of our 2-layer design. Using this trained network, our method was fully capable of reverse engineering the bent PCB. Note that the assignments of junction labels in Case Studies 2 and 3 represent two possible permutations corresponding to the same graph topology, i.e., the same connectivity arrangement (Figs. 17 and 18).

Fig. 17

3D segmented metal content (left) and the corresponding segmented junctions (right) of the bent designed PCB (created with Python 3.8 code).

Fig. 18

Colored nets and named junctions of the bent designed PCB (created with Python 3.8 code).

Discussion on the integration of image processing and machine learning

The integration of image processing and machine learning in this work achieves full automation while ensuring computational efficiency and accuracy. Preprocessing steps, such as watershed segmentation, isolate regions of interest, reducing the computational burden for subsequent machine learning-based semantic segmentation. This pipeline achieves a high degree of automation, eliminating the need for manual intervention, which is common in existing PCB design reconstruction methods. As a result, the method is labor-efficient and, in the case studies presented here, extracted netlists that matched the expected netlists with 100% accuracy.

Discussion of validation testing

To validate the segmentation accuracy and its practical utility, we not only evaluated performance using standard metrics like Intersection over Union (IoU) and F1 score but also conducted a comprehensive comparison of the extracted pseudo-netlist with the expected netlist derived from the known PCB design. This step directly assesses the method’s ability to produce functionally correct outputs. In our tests, the extracted netlist achieved a 100% match with the expected netlist, demonstrating the robustness and reliability of our approach in real-world applications.
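A netlist comparison of this kind can be sketched as follows. Each netlist is reduced to a set of nets, where every net is a set of pins, so that net names and ordering do not affect the result. The mapping `pin_of` from junction identifiers to component pins is hypothetical and stands for the junction-to-pin assignment obtained by inspecting the board; the pin and net names in the usage example are likewise invented for illustration.

```python
def normalize(netlist, pin_of=None):
    """Represent a netlist as a set of frozensets, ignoring net names and order."""
    pin_of = pin_of or {}
    return {
        frozenset(pin_of.get(j, j) for j in members)
        for members in netlist.values()
        if members  # nets without junctions are skipped
    }

def netlists_match(extracted, expected, pin_of=None):
    """True if both netlists describe the same connectivity."""
    return normalize(extracted, pin_of) == normalize(expected)

# Hypothetical usage with invented junction, pin, and net names:
extracted = {"Net1": ["J1", "J2"], "Net2": ["J3"]}
pin_of = {"J1": "R1.1", "J2": "U1.3", "J3": "C1.2"}
expected = {"GND": ["U1.3", "R1.1"], "VCC": ["C1.2"]}
print(netlists_match(extracted, expected, pin_of))
```

Because the comparison operates on sets of pins, two extractions that differ only by a permutation of junction labels are treated as equivalent once each is mapped to the component pins.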

It is important to note that the synthetic dataset, consisting of 168,750 sub-volumes from 250 synthetic PCBs, was used for training the deep learning model. In contrast, validation was performed on a set of independent cases, including a custom-designed 2-layer PCB, a bent version of the same PCB, and a complex commercial 3-layer PCB. These validation cases were selected to evaluate the method’s robustness across diverse scenarios and ensure its generalizability to real-world applications.

Discussion of scalability

The scalability of the proposed method is achieved through synthetic data generation, which eliminates the need for costly and time-intensive real-world datasets, and full automation, which minimizes manual intervention. The efficiency of direct 3D segmentation further enhances the method’s adaptability to diverse PCB configurations. The successful validation of the method on synthetic, designed, bent, and commercial PCBs highlights its robustness and suitability for various industrial scenarios.

Comparison with other methods

Tables 2 and 3 provide a comparison between the proposed method and other existing methods.

Table 2 Comparison between the proposed method and other existing 3D imaging methods.

Table 3 Comparison between the proposed 3D image semantic segmentation and netlist extraction method and conventional approaches based on stacking of 2D images.

Conclusion

This study introduces a groundbreaking automated method for the design reconstruction of PCBs utilizing 3D semantic segmentation of X-ray CT images and netlist extraction. By integrating advanced image processing techniques with machine learning algorithms, we have developed a robust and efficient approach to accurately segment copper traces and junctions within PCBs. Unlike previous methods, our approach does not rely on extensive labeled datasets, thanks to the use of inherently labeled synthetic data. Additionally, by performing direct 3D segmentation, our method significantly improves the ease, accuracy, robustness, and universality of the process, eliminating the need to fix the orientation of images.

This capability also allows for the effective processing of images of PCBs that have undergone bending, a common occurrence in PCBs with smaller thicknesses. The broader implications of this approach extend to various physical and biological sciences where 3D image segmentation is vital. Our approach has been validated on both synthetic and real-world PCB datasets, demonstrating high accuracy and reliability.

By improving the efficiency of PCB design reconstruction, the proposed automated approach mitigates issues related to part obsolescence, intellectual property recovery, and compliance. Future work will focus on further refining the model, expanding the dataset, and exploring additional applications within the area of 3D image analysis and interpretation.