SOFT-AlignUNet: A Lightweight Transformer with Feature Alignment

WU Rui-Jia, ZHANG Hong-Gang *

School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876

*Corresponding author

Funding: none

Opened online: 11 January 2022

Accepted by: none

Citation: WU Rui-Jia, ZHANG Hong-Gang. SOFT-AlignUNet: A Lightweight Transformer with Feature Alignment[OL]. [11 January 2022]. http://en.paper.edu.cn/en_releasepaper/content/4755984

The Transformer, the prevalent backbone architecture in natural language processing, has been adopted in various vision tasks since the Vision Transformer was proposed. Transformers have been shown to perform on par with CNNs, and even better when the dataset is large enough. However, the original Vision Transformer suffers from its straightforward structure, which requires a large number of parameters and expensive computation, especially in dense prediction tasks. This paper focuses on medical image semantic segmentation. In medical imaging, UNet remains the most popular backbone, and many researchers have recently proposed hybrid Transformer-CNN or pure Transformer UNet models. However, the inherent feature misalignment caused by resizing and concatenating feature maps has received little attention. In this paper, a lightweight Transformer-CNN hybrid UNet, SOFT-AlignUNet (SOFT-AU), is proposed to address these issues. On the one hand, a novel softmax-free Transformer, which reduces the computational cost to be linear in the number of patches, is introduced into the UNet architecture to substantially lower the computation cost. On the other hand, feature misalignment is taken into account, and a river-like Feature Alignment Flow is proposed to estimate spatial deviations and correct the features accordingly. The architecture achieves highly competitive results on the public Synapse and DRIVE datasets with a small model size and low computational requirements. The results suggest that it is a promising network for real-world deployment.
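To illustrate why dropping the softmax can make attention cost linear in the number of patches, the following is a minimal sketch of generic kernelized (linear) attention. It is not the softmax-free Transformer proposed in the paper, whose details are not given on this page. The feature map (`elu(x) + 1`) and all function names are assumptions chosen for illustration. The idea is that, without a softmax over the n x n score matrix, the product can be regrouped as phi(Q) (phi(K)^T V), so the n x n matrix is never formed.

```python
import math
import random


def feature_map(x):
    # Assumed positive feature map elu(v) + 1: v + 1 for v > 0, exp(v) otherwise.
    return [[v + 1.0 if v > 0 else math.exp(v) for v in row] for row in x]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    # Plain-Python matrix product of a (m x p) and b (p x q).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def quadratic_attention(q, k, v):
    """Reference version: builds the full n x n similarity matrix."""
    fq, fk = feature_map(q), feature_map(k)
    scores = matmul(fq, transpose(fk))          # n x n
    out = matmul(scores, v)                     # n x d_v
    # Normalise each row by the sum of its similarities.
    return [[o / sum(s) for o in o_row] for o_row, s in zip(out, scores)]


def linear_attention(q, k, v):
    """Regrouped version: cost grows linearly with the number of patches n."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)               # d x d_v, independent of n in size
    k_sum = [sum(col) for col in zip(*fk)]      # length-d vector of summed keys
    numer = matmul(fq, kv)                      # n x d_v
    denom = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return [[x / z for x in row] for row, z in zip(numer, denom)]


if __name__ == "__main__":
    random.seed(0)
    n, d = 8, 4  # hypothetical sizes: 8 patches, 4 channels
    q, k, v = ([[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
               for _ in range(3))
    ref, fast = quadratic_attention(q, k, v), linear_attention(q, k, v)
    diff = max(abs(a - b) for ra, rb in zip(ref, fast) for a, b in zip(ra, rb))
    print("max abs difference:", diff)
```

Both functions compute the same output up to floating-point error; only the order of multiplication differs. The regrouped form trades the n x n intermediate for a d x d_v one, which is the source of the linear scaling in n.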

Keywords: computer technology; vision transformer; medical image semantic segmentation; feature alignment; lightweight
