Proceedings of the 2024 3rd International Conference on Engineering Management and Information Science (EMIS 2024)

Enhancing Code Retrieval through Deep Learning and Information Retrieval Fusion

Authors
Wenshuo Cheng (1, 2), Jianbo Jiang (1, 2), Junyu Lu (3, *)
(1) AHU-IAI AI Joint Laboratory, Anhui University, Hefei, China
(2) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
(3) School of Data Science, University of Science and Technology of China, Hefei, China
*Corresponding author. Email: lujunyu@mail.ustc.edu.cn
Corresponding Author
Junyu Lu
Available Online 14 July 2024.
DOI
10.2991/978-94-6463-447-1_33
Keywords
code retrieval; information retrieval; code translation
Abstract

Code retrieval is a widely used technique that searches for the code fragments most relevant to a developer's natural language query. Most existing work feeds the whole code directly into a deep learning model for training and does not make effective use of auxiliary information such as method names or input parameters. Yet this auxiliary information is intuitive, easy to obtain, and can be very helpful for improving retrieval results. In this paper, we divide code information into two categories: implicit structural information and explicit auxiliary information. To make better use of both, we propose a two-stage code retrieval model. In the first stage, deep learning is used to mine the implicit structural information in the code; we adopt an improved code translation mechanism to recall multiple candidate code segments. In the second stage, information retrieval is used to mine the explicit auxiliary information in the code and to re-rank the candidates recalled in the first stage. We validate our method on the Java dataset of CodeSearchNet. The experimental results indicate that the method is effective.
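
The two-stage recall and re-rank flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the abstract gives no implementation details, so the first stage here uses a toy bag-of-words cosine score as a stand-in for the deep learning recall model, the second stage uses simple token overlap with the method name and parameters as a stand-in for the information retrieval component, and the names `recall`, `rerank`, `search`, `CORPUS`, and the fusion `weight` are all hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split on non-alphanumerics and camelCase boundaries, then lowercase."""
    return [p.lower() for p in re.findall(r"[A-Za-z][a-z]*|\d+", text)]

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recall(query, corpus, k):
    """Stage 1 (stand-in for a learned model): score whole code bodies, keep the top k."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s["body"]))), i) for i, s in enumerate(corpus)]
    scored.sort(reverse=True)
    return scored[:k]

def rerank(query, corpus, candidates, weight=0.5):
    """Stage 2: re-rank candidates using explicit auxiliary information.

    The auxiliary score is the fraction of query tokens found in the method
    name or parameter names; `weight` (an assumed value) blends it with the
    stage 1 score.
    """
    q = set(tokenize(query))
    results = []
    for score, i in candidates:
        snippet = corpus[i]
        aux = set(tokenize(snippet["name"])) | set(tokenize(" ".join(snippet["params"])))
        overlap = len(q & aux) / len(q) if q else 0.0
        results.append(((1 - weight) * score + weight * overlap, snippet["name"]))
    results.sort(reverse=True)
    return [name for _, name in results]

def search(query, corpus, k=2):
    """Run both stages and return method names in final ranked order."""
    return rerank(query, corpus, recall(query, corpus, k))

# Hypothetical toy corpus; real experiments in the paper use CodeSearchNet (Java).
CORPUS = [
    {"name": "readFileLines", "params": ["path"],
     "body": "BufferedReader reader = open(path); return reader.lines();"},
    {"name": "sumList", "params": ["numbers"],
     "body": "int total = 0; for (int n : numbers) total += n; return total;"},
    {"name": "process", "params": ["input"],
     "body": "read file lines from path and return lines"},
]

if __name__ == "__main__":
    print(search("read lines from file", CORPUS))
```

In this toy corpus the first stage ranks `process` highest because its body happens to share the most tokens with the query, while the second stage promotes `readFileLines` because its method name matches the query directly. That is the kind of correction the explicit auxiliary information is meant to provide, shown here only under the simplifying assumptions stated above.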

Copyright
© 2024 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the 2024 3rd International Conference on Engineering Management and Information Science (EMIS 2024)
Series
Advances in Computer Science Research
Publication Date
14 July 2024
ISBN
978-94-6463-447-1
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Wenshuo Cheng
AU  - Jianbo Jiang
AU  - Junyu Lu
PY  - 2024
DA  - 2024/07/14
TI  - Enhancing Code Retrieval through Deep Learning and Information Retrieval Fusion
BT  - Proceedings of the 2024 3rd International Conference on Engineering Management and Information Science (EMIS 2024)
PB  - Atlantis Press
SP  - 299
EP  - 306
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-447-1_33
DO  - 10.2991/978-94-6463-447-1_33
ID  - Cheng2024
ER  -