Enhancing Code Retrieval through Deep Learning and Information Retrieval Fusion
- DOI
- 10.2991/978-94-6463-447-1_33
- Keywords
- code retrieval; information retrieval; code translation
- Abstract
Code retrieval is a widely used technique that returns the code fragments most relevant to developers’ natural language queries. Most existing work feeds the whole code directly into a deep learning model for training and does not make effective use of auxiliary information such as method names or input parameters. Yet this auxiliary information is intuitive, easy to obtain, and can be very helpful for improving retrieval results. In this paper, we divide code information into two categories: implicit structural information and explicit auxiliary information. To make better use of both, we propose a two-stage code retrieval model. In the first stage, deep learning is used to mine the implicit structural information in the code; we adopt an improved code translation mechanism to recall multiple candidate code segments. In the second stage, information retrieval is used to mine the explicit auxiliary information in the code, re-ranking the candidates recalled in the first stage. We validate our method on the Java dataset of CodeSearchNet. The experimental results show that the method is effective.
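The two-stage flow described in the abstract (recall candidates first, then re-rank them using method names and parameters) can be illustrated with a minimal sketch. This is not the authors' model. Plain token overlap (Jaccard similarity) stands in for both the deep-learning recall stage and the information-retrieval scorer, and the names `Snippet`, `recall`, `rerank`, `search`, and `DEMO_CORPUS` are hypothetical, introduced only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    method_name: str      # explicit auxiliary information
    params: list[str]     # explicit auxiliary information
    body: str             # whole code text, used by the recall stage


def tokens(text: str) -> set[str]:
    # Split camelCase identifiers, keep alphabetic runs, lowercase them.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return {t.lower() for t in re.findall(r"[A-Za-z]+", spaced)}


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recall(query: str, corpus: list[Snippet], k: int) -> list[Snippet]:
    # Stage 1 (ASSUMPTION: a stand-in for the deep-learning model):
    # score each whole snippet body against the query, keep the top k.
    q = tokens(query)
    ranked = sorted(corpus, key=lambda s: jaccard(q, tokens(s.body)), reverse=True)
    return ranked[:k]


def rerank(query: str, candidates: list[Snippet]) -> list[Snippet]:
    # Stage 2 (ASSUMPTION: a simple lexical scorer): re-order the recalled
    # candidates by overlap between the query and method name + parameters.
    q = tokens(query)

    def aux_score(s: Snippet) -> float:
        aux = tokens(s.method_name) | tokens(" ".join(s.params))
        return jaccard(q, aux)

    return sorted(candidates, key=aux_score, reverse=True)


def search(query: str, corpus: list[Snippet], k: int = 2) -> list[Snippet]:
    return rerank(query, recall(query, corpus, k))


# Toy corpus, invented for this example.
DEMO_CORPUS = [
    Snippet("readFileLines", ["path"], "return Files.readAllLines(path);"),
    Snippet("process", ["input"], "// read file lines from path\nreturn read(input);"),
    Snippet("sortList", ["items"], "Collections.sort(items);"),
]

if __name__ == "__main__":
    for s in search("read file lines", DEMO_CORPUS):
        print(s.method_name)
```

On this toy corpus, the recall stage ranks `process` first because its body happens to share more tokens with the query, and the re-ranking stage moves `readFileLines` to the top because its method name matches the query. This is the kind of correction the second stage is meant to provide.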
- Copyright
- © 2024 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Wenshuo Cheng
AU - Jianbo Jiang
AU - Junyu Lu
PY - 2024
DA - 2024/07/14
TI - Enhancing Code Retrieval through Deep Learning and Information Retrieval Fusion
BT - Proceedings of the 2024 3rd International Conference on Engineering Management and Information Science (EMIS 2024)
PB - Atlantis Press
SP - 299
EP - 306
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6463-447-1_33
DO - 10.2991/978-94-6463-447-1_33
ID - Cheng2024
ER -