CNN-based Symbol Recognition and Detection in Piping Drawings
Piping is an essential component of buildings, and its as-built information is critical to facility management tasks. Manually extracting piping information from legacy drawings in paper, PDF, or image format is mentally taxing, time-consuming, and error-prone. Symbol recognition and detection are core problems in the computer-based interpretation of piping drawings, and the main technical challenge is to determine robust features that are invariant to scaling, rotation, and translation. This thesis uses convolutional neural networks (CNNs) to extract features automatically from raw images and, consequently, to locate and recognize symbols in piping drawings.
In this thesis, the Spatial Transformer Network (STN) is applied to improve the performance of a standard CNN model for recognizing piping symbols, and the Faster Region-based Convolutional Neural Network (Faster RCNN) is adopted for symbol detection. For experimentation, two synthetic datasets are generated, one for symbol recognition and one for symbol detection. For recognition, eight types of symbols are synthesized based on the geometric constraints between their primitives. The drawing samples for detection are manually sketched in AutoCAD MEP using seven types of symbols selected from its piping component library. Both sets of samples are augmented with various scales, rotations, and random noise.
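The augmentation step described above can be illustrated with a minimal sketch. It assumes a grayscale symbol image stored as a 2-D NumPy array with values in [0, 1]. The function name `augment`, its parameters, nearest-neighbour resampling, and Gaussian noise are all illustrative assumptions, not the procedure used in the thesis.

```python
import numpy as np

def augment(image, scale, angle_deg, noise_std, rng):
    """Scale and rotate a 2-D image about its center, then add Gaussian noise.

    Hypothetical helper for illustration only. Uses inverse mapping with
    nearest-neighbour sampling; pixels that map outside the source are 0.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)

    # Coordinates of every output pixel, relative to the image center.
    ys, xs = np.mgrid[0:h, 0:w]
    x0, y0 = xs - cx, ys - cy

    # Inverse transform: find the source location of each output pixel.
    src_x = (cos_t * x0 + sin_t * y0) / scale + cx
    src_y = (-sin_t * x0 + cos_t * y0) / scale + cy
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)

    # Copy only the pixels whose source location lies inside the image.
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros((h, w), dtype=float)
    out[valid] = image[sy[valid], sx[valid]]

    # Additive noise (assumed Gaussian here), clipped back to [0, 1].
    if noise_std > 0:
        out = out + rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0)
```

Calling `augment(img, 1.2, 30.0, 0.05, np.random.default_rng(0))` on a symbol image would yield one enlarged, rotated, noisy variant of the same size; sampling the three parameters at random produces many such variants per symbol.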
In the symbol recognition experiment, the accuracy of the CNN + STN model is compared with that of the standard CNN model. The spatial transformer layer improves the accuracy in classifying piping symbols from 95.39% to 98.26%. For the symbol detection task, the experiment is conducted using a public implementation of Faster RCNN. The mean Average Precision (mAP) is 82.8% at an Intersection over Union (IoU) threshold of 0.5. Imbalanced data (i.e., unequal numbers of samples per class) lead to a lower Average Precision for the minority class. In addition, the symbol library, the small dataset, and the complex backbone network limit the generality of the model. Future work will focus on collecting a larger set of drawings and improving the network's geometric invariance.
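To make the IoU threshold concrete, the following sketch computes the IoU of two axis-aligned boxes and applies a 0.5 cutoff. The `(x1, y1, x2, y2)` box format and the names `iou` and `is_match` are assumptions made for illustration; they are not taken from the Faster RCNN implementation used in the experiments.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_match(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: the overlap criterion for counting a detection as correct."""
    return iou(pred_box, gt_box) >= threshold
```

For example, a predicted box shifted by half its width relative to a ground-truth box of the same size has an IoU of 1/3 and would not pass the 0.5 threshold, whereas a shift of one fifth of the width gives an IoU of 2/3 and would pass.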