The use of optical neural networks in AI is an attractive idea that has long stimulated important work. The potential benefits include low power consumption, high speed, and the ability to handle greater complexity. Achieving this ideal is difficult, however, especially because manufacturing imperfections can degrade accuracy. Intel proposed an architecture-level approach to mitigating this problem in a recent blog post and accompanying paper.
“We considered two architectures for building an optical neural network engine from MZIs (Mach-Zehnder interferometers). One, which we called GridNet, arranges the MZIs in a grid; the other, which we called FFTNet, arranges them in a butterfly pattern modeled on fast Fourier transform computational architectures (although in our case the weights are learned from data, so the computation will not, in general, be a true FFT). We then trained both architectures in a software simulation on a benchmark deep learning task: handwritten digit recognition (MNIST).
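For context, a single MZI implements a 2×2 unitary transformation on two optical modes, set by two phase parameters, and meshes of these 2×2 blocks compose into larger unitaries. The following numpy sketch is illustrative only: the MZI parameterization and the grid layout are common textbook conventions, not necessarily the exact ones used in the paper.

```python
import numpy as np

def mzi(theta, phi):
    """2x2 unitary for a single MZI; one common parameterization
    (internal phase theta, external phase phi). Illustrative only."""
    return np.array([
        [np.exp(1j * phi) * np.cos(theta), -np.sin(theta)],
        [np.exp(1j * phi) * np.sin(theta),  np.cos(theta)],
    ])

def embed(u2, n, i):
    """Embed a 2x2 block acting on modes (i, i+1) into an n-mode identity."""
    u = np.eye(n, dtype=complex)
    u[i:i+2, i:i+2] = u2
    return u

def grid_mesh(n, thetas, phis):
    """Rectangular mesh: layers of MZIs on alternating even/odd mode
    pairs, in the spirit of a GridNet-style layout."""
    u = np.eye(n, dtype=complex)
    k = 0
    for layer in range(n):
        start = layer % 2                  # alternate even/odd pairings
        for i in range(start, n - 1, 2):
            u = embed(mzi(thetas[k], phis[k]), n, i) @ u
            k += 1
    return u

rng = np.random.default_rng(0)
n, n_mzis = 4, 6                           # a 4-mode, 4-layer grid uses 6 MZIs
U = grid_mesh(n, rng.uniform(0, 2 * np.pi, n_mzis),
              rng.uniform(0, 2 * np.pi, n_mzis))
print(np.allclose(U @ U.conj().T, np.eye(n)))  # the mesh is unitary
```

Because each MZI is unitary, the whole mesh is too; training such a network means learning the phase settings rather than arbitrary weight matrices.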
“We found that with double-precision floating-point computation, GridNet achieved higher accuracy than FFTNet (~98% vs. ~95%). However, FFTNet proved significantly more robust to manufacturing imprecision, which we simulated by adding noise to the phase shift and transmittance of each MZI. When these noise levels were set to realistic values, GridNet’s accuracy fell below 50% while FFTNet’s remained nearly constant.
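The robustness test described above can be mimicked in simulation by perturbing an MZI’s phase settings and measuring how far the realized transfer matrix drifts from the ideal one. A minimal sketch, assuming a Gaussian noise model and illustrative magnitudes (not the paper’s calibrated values):

```python
import numpy as np

def mzi(theta, phi):
    """2x2 unitary for one MZI (illustrative parameterization)."""
    return np.array([
        [np.exp(1j * phi) * np.cos(theta), -np.sin(theta)],
        [np.exp(1j * phi) * np.sin(theta),  np.cos(theta)],
    ])

def avg_error(theta, phi, sigma, trials=1000, seed=0):
    """Mean operator-norm error between the ideal MZI and one whose
    phases are perturbed by Gaussian fabrication noise of s.d. sigma."""
    rng = np.random.default_rng(seed)
    ideal = mzi(theta, phi)
    errs = []
    for _ in range(trials):
        dt, dp = rng.normal(0, sigma, 2)
        errs.append(np.linalg.norm(mzi(theta + dt, phi + dp) - ideal, 2))
    return np.mean(errs)

for sigma in (0.0, 0.01, 0.05):
    print(f"sigma={sigma}: mean error {avg_error(0.8, 1.2, sigma):.4f}")
```

In a deep mesh these per-device errors compound through many layers, which is why architectures that distribute the computation differently (such as FFTNet’s butterfly layout) can degrade far more gracefully than others under the same device noise.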
“If ONNs are to become a viable part of the AI hardware ecosystem, they will need to scale to larger circuits and industrial manufacturing techniques. Our work addresses both of these issues. Larger circuits will require more devices, such as MZIs, per chip, so attempting to ‘fine-tune’ each device on a chip after manufacturing will become increasingly impractical. A more scalable strategy is to train ONNs in software and then mass-produce circuits based on the resulting parameters. Our results suggest that choosing the right architecture in advance can dramatically increase the likelihood that the resulting circuits will achieve the desired performance, even in the face of manufacturing variations.”
Much more detail can be found in the paper, “Design of optical neural networks with component imprecisions,” published in Optics Express. Wierzynski was one of the authors, along with colleagues at Intel and researchers at UC Berkeley.
The researchers write in their conclusion: “[Our results] provide clear guidelines for the architectural design of efficient and fault-tolerant ONNs. In the future, it will also be important to study algorithmic and training strategies. A central problem in deep learning is designing neural networks that are expressive enough to model the data while being regularized to avoid overfitting the noise in the training set. To this end, a wide variety of regularization techniques, such as Dropout, DropConnect, and data augmentation, have been developed. This problem parallels the trade-off, presented here, between the expressiveness of an ONN and its robustness to imprecision. Indeed, an important conclusion is that, beyond the choice of architecture, even minor changes in the configuration of an ONN have a large effect on the network’s robustness to faulty components.”
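For readers unfamiliar with the regularization techniques the conclusion mentions, Dropout is the simplest to illustrate: during training, each activation is randomly zeroed with probability p, and the survivors are rescaled so the expected output is unchanged. A minimal “inverted dropout” sketch in numpy (a standard formulation, not code from the paper):

```python
import numpy as np

def dropout(x, p, rng, train=True):
    """Inverted dropout: during training, zero each activation with
    probability p and rescale survivors by 1/(1-p); identity at test time."""
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones((2, 5))
print(dropout(x, 0.5, rng))               # survivors rescaled to 2.0, rest zeroed
print(dropout(x, 0.5, rng, train=False))  # unchanged at inference
```

The analogy drawn in the conclusion is that, just as dropout trades some raw expressiveness for robustness to noisy training data, an ONN architecture can trade expressiveness for robustness to noisy hardware.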
Link to the Intel AI blog: https://www.intel.ai/optical-neural-networks/#gs.fe8yao
Link to the paper: https://www.osapublishing.org/oe/fulltext.cfm?uri=oe-27-10-14009&id=411885