Skip to main content

Spiking Wavelet Transformer

  • Conference paper
  • First Online:
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Spiking neural networks (SNNs) offer an energy-efficient alternative to conventional deep learning by emulating the event-driven processing manner of the brain. Incorporating Transformers with SNNs has shown promise for accuracy. However, they struggle to learn high-frequency patterns, such as moving edges and pixel-level brightness changes, because they rely on the global self-attention mechanism. Learning these high-frequency representations is challenging but essential for SNN-based event-driven vision. To address this issue, we propose the Spiking Wavelet Transformer (SWformer), an attention-free architecture that effectively learns comprehensive spatial-frequency features in a spike-driven manner by leveraging the sparse wavelet transform. The critical component is a Frequency-Aware Token Mixer (FATM) with three branches: 1) spiking wavelet learner for spatial-frequency domain learning, 2) convolution-based learner for spatial feature extraction, and 3) spiking pointwise convolution for cross-channel information aggregation - with negative spike dynamics incorporated in 1) to enhance frequency representation. The FATM enables the SWformer to outperform vanilla Spiking Transformers in capturing high-frequency visual components, as evidenced by our empirical results. Experiments on both static and neuromorphic datasets demonstrate SWformer’s effectiveness in capturing spatial-frequency patterns in a multiplication-free and event-driven fashion, outperforming state-of-the-art SNNs. SWformer achieves a 22.03% reduction in parameter count, and a 2.52% performance improvement on the ImageNet dataset compared to vanilla Spiking Transformers. The code is available at: https://github.com/bic-L/Spiking-Wavelet-Transformer.

Y. Fang and Z. Wang—Equal Contribution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Auge, D., Mueller, E.: Resonate-and-fire neurons as frequency selective input encoders for spiking neural networks (2020)

    Google Scholar 

  2. Basu, A., Deng, L., Frenkel, C., Zhang, X.: Spiking neural network integrated circuits: a review of trends and future directions. In: 2022 IEEE Custom Integrated Circuits Conference (CICC), pp. 1–8. IEEE (2022)

    Google Scholar 

  3. Bellec, G., Salaj, D., Subramoney, A., Legenstein, R., Maass, W.: Long short-term memory and learning-to-learn in networks of spiking neurons. In: Advances in Neural Information Processing Systems, vol. 31 (2018)

    Google Scholar 

  4. Bi, Y., Chadha, A., Abbas, A., Bourtsoulatze, E., Andreopoulos, Y.: Graph-based object classification for neuromorphic vision sensing. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 491–501 (2019)

    Google Scholar 

  5. Bochner, S., Chandrasekharan, K.: Fourier transforms, No. 19. Princeton University Press (1949)

    Google Scholar 

  6. Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2021)

    Google Scholar 

  7. Burkitt, A.N.: A review of the integrate-and-fire neuron model: I. homogeneous synaptic input. Biol. Cybern. 95, 1–19 (2006)

    Google Scholar 

  8. Cao, Y., Chen, Y., Khosla, D.: Spiking deep convolutional neural networks for energy-efficient object recognition. Int. J. Comput. Vision 113(1), 54–66 (2015)

    Article  MathSciNet  Google Scholar 

  9. Chen, S., Ye, T., Bai, J., Chen, E., Shi, J., Zhu, L.: Sparse sampling transformer with uncertainty-driven ranking for unified removal of raindrops and rain streaks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 13106–13117 (2023)

    Google Scholar 

  10. Chen, S., et al.: MSP-former: Multi-scale projection transformer for single image desnowing. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE (2023)

    Google Scholar 

  11. Dao, T., et al.: Monarch: expressive structured matrices for efficient and accurate training. In: International Conference on Machine Learning, pp. 4690–4721. PMLR (2022)

    Google Scholar 

  12. Davies, M., et al.: Loihi: a neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018)

    Article  Google Scholar 

  13. Davies, M., et al.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proc. IEEE 109(5), 911–934 (2021)

    Article  Google Scholar 

  14. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

    Google Scholar 

  15. Deng, S., Li, Y., Zhang, S., Gu, S.: Temporal efficient training of spiking neural network via gradient re-weighting. arXiv preprint arXiv:2202.11946 (2022)

  16. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. arXiv preprint arXiv:2105.11654 (2021)

  17. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 35, pp. 34377–34390 (2022)

    Google Scholar 

  18. Fang, W., Yu, Z., Chen, Y., Huang, T., Masquelier, T., Tian, Y.: Deep residual learning in spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 21056–21069 (2021)

    Google Scholar 

  19. Frady, E.P., et al.: Efficient neuromorphic signal processing with resonator neurons. J. Signal Process. Syst. 94(10), 917–927 (2022)

    Article  Google Scholar 

  20. Gaudart, L., Crebassa, J., Petrakian, J.P.: Wavelet transform in human visual channels. Appl. Opt. 32(22), 4119–4127 (1993)

    Google Scholar 

  21. Gu, P., Xiao, R., Pan, G., Tang, H.: STCA: spatio-temporal credit assignment with delayed feedback in deep spiking neural networks. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence. pp. 1366–1372. International Joint Conferences on Artificial Intelligence Organization, Macao, China (2019). https://doi.org/10.24963/ijcai.2019/189

  22. Guo, Y., Zhang, L., Chen, Y., Tong, X., Liu, X., Wang, Y., Huang, X., Ma, Z.: Real spike: Learning real-valued spikes for spiking neural networks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) European Conference on Computer Vision, pp. 52–68. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19775-8_4

  23. He, C., et al.: Camouflaged object detection with feature decomposition and edge reconstruction. In: CVPR, pp. 22046–22055 (2023)

    Google Scholar 

  24. He, C., Li, K., Zhang, Y., Xu, G., Tang, L.: Weakly-supervised concealed object segmentation with SAM-based pseudo labeling and multi-scale feature grouping. In: NeurIPS (2024)

    Google Scholar 

  25. He, C., et al.: Diffusion models in low-level vision: a survey. arXiv preprint arXiv:2406.11138 (2024)

  26. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

    Google Scholar 

  27. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 630–645. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_38

    Chapter  Google Scholar 

  28. Hopkins, M., Pineda-Garcia, G., Bogdan, P.A., Furber, S.B.: Spiking neural networks for computer vision. Interface Focus 8(4), 20180007 (2018)

    Article  Google Scholar 

  29. Hu, Y., Tang, H., Pan, G.: Spiking deep residual networks. IEEE Trans. Neural Netw. Learn. Syst. 34(8), 5200–5205 (2021)

    Article  Google Scholar 

  30. Hu, Y., Deng, L., Wu, Y., Yao, M., Li, G.: Advancing spiking neural networks towards deep residual learning. arXiv preprint arXiv:2112.08954 (2021)

  31. Hu, Y., Deng, L., Wu, Y., Yao, M., Li, G.: Advancing spiking neural networks toward deep residual learning. IEEE Trans. Neural Netw. Learn. Syst. (2024)

    Google Scholar 

  32. Ji, M., Wang, Z., Yan, R., Liu, Q., Xu, S., Tang, H.: SCTN: event-based object tracking with energy-efficient deep convolutional spiking neural networks. Front. Neurosci. 17, 1123698 (2023)

    Article  Google Scholar 

  33. Jiménez-Fernández, A., et al.: A binaural neuromorphic auditory sensor for FPGA: a spike signal processing approach. IEEE Trans. Neural Netw. Learn. Syst. 28(4), 804–818 (2016)

    Article  Google Scholar 

  34. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images (2009)

    Google Scholar 

  35. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791

  36. Lee, D., Yin, R., Kim, Y., Moitra, A., Li, Y., Panda, P.: TT-SNN: tensor train decomposition for efficient spiking neural network training. arXiv preprint arXiv:2401.08001 (2024)

  37. Lee, I., Kim, J., Kim, Y., Kim, S., Park, G., Park, K.T.: Wavelet transform image coding using human visual system. In: Proceedings of APCCAS’94-1994 Asia Pacific Conference on Circuits and Systems, pp. 619–623. IEEE (1994)

    Google Scholar 

  38. Li, H., Liu, H., Ji, X., Li, G., Shi, L.: CIFAR10-DVS: an event-stream dataset for object classification. Front. Neurosci. 11 (2017)

    Google Scholar 

  39. Li, Q., Shen, L., Guo, S., Lai, Z.: Wavelet integrated CNNs for noise-robust image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7245–7254 (2020)

    Google Scholar 

  40. Li, Y., Kim, Y., Park, H., Geller, T., Panda, P.: Neuromorphic data augmentation for training spiking neural networks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) European Conference on Computer Vision, pp. 631–649. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20071-7_37

  41. Li, Y., Kim, Y., Park, H., Geller, T., Panda, P.: Neuromorphic data augmentation for training spiking neural networks. arXiv preprint arXiv:2203.06145 (2022)

  42. Liu, Q., Xing, D., Tang, H., Ma, D., Pan, G.: Event-based action recognition using motion information and spiking neural networks. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, pp. 1743–1749. International Joint Conferences on Artificial Intelligence Organization, Montreal, Canada (2021). https://doi.org/10.24963/ijcai.2021/240

  43. López-Randulfe, J., Duswald, T., Bing, Z., Knoll, A.: Spiking neural network for Fourier transform and object detection for automotive radar. Front. Neurorobot. 15, 688344 (2021)

    Article  Google Scholar 

  44. López-Randulfe, J., et al.: Time-coded spiking Fourier transform in neuromorphic hardware. IEEE Trans. Comput. 71(11), 2792–2802 (2022)

    Article  Google Scholar 

  45. Maass, W.: Networks of spiking neurons: the third generation of neural network models. Neural Netw. 10(9), 1659–1671 (1997)

    Article  Google Scholar 

  46. Maro, J.M., Ieng, S.H., Benosman, R.: Event-based gesture recognition with dynamic background suppression using smartphone computational capabilities. Front. Neurosci. 14, 275 (2020)

    Article  Google Scholar 

  47. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12444–12453 (2022)

    Google Scholar 

  48. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Netw. 153, 254–268 (2022)

    Article  Google Scholar 

  49. Merolla, P.A., et al.: A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345(6197), 668–673 (2014)

    Article  Google Scholar 

  50. Miao, S., et al.: Neuromorphic vision datasets for pedestrian detection, action recognition, and fall detection. Front. Neurorobot. 13, 38 (2019)

    Article  Google Scholar 

  51. Orchard, G., Jayawant, A., Cohen, G.K., Thakor, N.: Converting static image datasets to spiking neuromorphic datasets using saccades. Front. Neuroscience 9 (2015)

    Google Scholar 

  52. Park, N., Kim, S.: How do vision transformers work? arXiv preprint arXiv:2202.06709 (2022)

  53. Pei, J., et al.: Towards artificial general intelligence with hybrid tianjic chip architecture. Nature 572(7767), 106–111 (2019)

    Article  Google Scholar 

  54. Rao, A., Plank, P., Wild, A., Maass, W.: A long short-term memory for AI applications in spike-based neuromorphic hardware. Nature Mach. Intell. 4(5), 467–479 (2022)

    Article  Google Scholar 

  55. Rathi, N., et al.: Exploring neuromorphic computing based on spiking neural networks: algorithms to hardware. ACM Comput. Surv. 55(12), 1–49 (2023)

    Article  Google Scholar 

  56. Rathi, N., Srinivasan, G., Panda, P., Roy, K.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)

  57. Roy, K., Jaiswal, A., Panda, P.: Towards spike-based machine intelligence with neuromorphic computing. Nature 575(7784), 607–617 (2019)

    Article  Google Scholar 

  58. Schuman, C.D., Kulkarni, S.R., Parsa, M., Mitchell, J.P., Kay, B., et al.: Opportunities for neuromorphic computing algorithms and applications. Nature Comput. Sci. 2(1), 10–19 (2022)

    Article  Google Scholar 

  59. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)

    Google Scholar 

  60. Shen, S., Zhao, D., Shen, G., Zeng, Y.: TIM: an efficient temporal interaction module for spiking transformer. arXiv preprint arXiv:2401.11687 (2024)

  61. Si, C., Yu, W., Zhou, P., Zhou, Y., Wang, X., Yan, S.: Inception transformer. In: Advances in Neural Information Processing Systems, vol. 35, pp. 23495–23509 (2022)

    Google Scholar 

  62. Sironi, A., Brambilla, M., Bourdis, N., Lagorce, X., Benosman, R.: HATS: histograms of averaged time surfaces for robust event-based object classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1731–1740 (2018)

    Google Scholar 

  63. Stewart, K.M., Neftci, E.O.: Meta-learning spiking neural networks with surrogate gradient descent. Neuromorphic Comput. Eng. 2(4), 044002 (2022)

    Article  Google Scholar 

  64. Su, Q., et al.: Deep directly-trained spiking neural networks for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6555–6565 (2023)

    Google Scholar 

  65. Tripura, T., Chakraborty, S.: Wavelet neural operator for solving parametric partial differential equations in computational mechanics problems. Comput. Methods Appl. Mech. Eng. 404, 115783 (2023)

    Article  MathSciNet  Google Scholar 

  66. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

    Google Scholar 

  67. Viale, A., Marchisio, A., Martina, M., Masera, G., Shafique, M.: CarSNN: an efficient spiking neural network for event-based autonomous cars on the Loihi neuromorphic research processor. In: 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–10. IEEE (2021)

    Google Scholar 

  68. Wang, Z., Fang, Y., Cao, J., Xu, R.: Bursting spikes: efficient and high-performance SNNs for event-based vision. arXiv preprint arXiv:2311.14265 (2023)

  69. Wang, Z., Fang, Y., Cao, J., Zhang, Q., Wang, Z., Xu, R.: Masked spiking transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1761–1771 (2023)

    Google Scholar 

  70. Wu, H., Yang, Y., Chen, H., Ren, J., Zhu, L.: Mask-guided progressive network for joint raindrop and rain streak removal in videos. In: Proceedings of the 31st ACM International Conference on Multimedia, pp. 7216–7225 (2023)

    Google Scholar 

  71. Yang, Y., Wu, H., Aviles-Rivero, A.I., Zhang, Y., Qin, J., Zhu, L.: Genuine knowledge from practice: diffusion test-time adaptation for video adverse weather removal. arXiv preprint arXiv:2403.07684 (2024)

  72. Yang, Z., et al.: DashNet: a hybrid artificial and spiking neural network for high-speed object tracking. arXiv preprint arXiv:1909.12942 (2019)

  73. Yao, M., Gao, H., Zhao, G., Wang, D., Lin, Y., Yang, Z., Li, G.: Temporal-wise attention spiking neural networks for event streams classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10221–10230 (2021)

    Google Scholar 

  74. Yao, M., Hu, J., Zhou, Z., Yuan, L., Tian, Y., Xu, B., Li, G.: Spike-driven transformer. arXiv preprint arXiv:2307.01694 (2023)

  75. Yao, M., et al.: Attention spiking neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 45(8), 9393–9410 (2023)

    Article  Google Scholar 

  76. Ye, C., Kornijcuk, V., Yoo, D., Kim, J., Jeong, D.S.: LaCERA: layer-centric event-routing architecture. Neurocomputing 520, 46–59 (2023)

    Article  Google Scholar 

  77. Ye, T., Zhang, Y., Jiang, M., Chen, L., Liu, Y., Chen, S., Chen, E.: Perceiving and modeling density for image dehazing. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) European Conference on Computer Vision, pp. 130–145. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19800-7_8

  78. Zhang, J., et al.: Spiking transformers for event-based single object tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8801–8810 (2022)

    Google Scholar 

  79. Zheng, H., Wu, Y., Deng, L., Hu, Y., Li, G.: Going deeper with directly-trained larger spiking neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 11062–11070 (2021)

    Google Scholar 

  80. Zhou, C., et al.: Spikingformer: spike-driven residual learning for transformer-based spiking neural network. arXiv preprint arXiv:2304.11954 (2023)

  81. Zhou, Z., et al.: Spikformer: when spiking neural network meets transformer. arXiv preprint arXiv:2209.15425 (2022)

  82. Zhu, R.J., Wang, Z., Gilpin, L., Eshraghian, J.K.: Autonomous driving with spiking neural networks. arXiv preprint arXiv:2405.19687 (2024)

Download references

Acknowledgements

This work is supported by the Guangzhou-HKUST(GZ) Joint Funding Program (Grant No. 2023A03J0682) and partially supported by a collaborative project with Brain Mind Innovation, inc. Special thanks to Mr. Yijian He.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Renjing Xu .

Editor information

Editors and Affiliations

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1268 KB)

Rights and permissions

Reprints and permissions

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Fang, Y., Wang, Z., Zhang, L., Cao, J., Chen, H., Xu, R. (2025). Spiking Wavelet Transformer. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15134. Springer, Cham. https://doi.org/10.1007/978-3-031-73116-7_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-73116-7_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73115-0

  • Online ISBN: 978-3-031-73116-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics