A Survey of Information Retrieval Techniques
Advances in Networks
Volume 5, Issue 2, November 2017, Pages: 40-46
Received: Jun. 25, 2017; Accepted: Jul. 10, 2017; Published: Nov. 28, 2017
Authors
Mang’are Fridah Nyamisa, Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
Waweru Mwangi, Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
Wilson Cheruiyot, Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
Abstract
The explosive growth of resources stored in various forms and transmitted over the internet has necessitated research into information retrieval technologies. The major information retrieval mechanisms commonly employed include the vector space model, the Boolean model, the fuzzy set model, and the probabilistic retrieval model. These models measure the similarity between a query and the documents in order to retrieve the documents that match the query. All of these approaches are keyword-based: they use lists of keywords to describe the information content. In this paper, a survey of these models is provided in order to understand their working mechanisms and shortcomings. This understanding is vital because it facilitates the choice of an information retrieval technique based on the underlying requirements. The survey revealed that each of the current information retrieval models falls short of expectations in one way or another. As such, they are not ideal for high-precision information retrieval applications.
Keywords
Information Retrieval, Model, Fuzzy, Boolean, Probabilistic, Query
To cite this article
Mang’are Fridah Nyamisa, Waweru Mwangi, Wilson Cheruiyot, A Survey of Information Retrieval Techniques, Advances in Networks. Vol. 5, No. 2, 2017, pp. 40-46. doi: 10.11648/j.net.20170502.12
Copyright
Copyright © 2017 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.