Research Journal of Recent Sciences, ISSN 2277-2502
Vol. 4(ISC-2014), 19-23 (2015), Res. J. Recent. Sci.
International Science Congress Association

Review Paper
Under the section of Computer and Information Technology Sciences

Image Retrieval - An Overview

Madhu Singh*
Department of Computer Science, Indian School of Mines, Dhanbad-826004, INDIA

Available online at: www.isca.in, www.isca.me
Received 15th October 2014, revised 5th February 2015, accepted 5th April 2015

Abstract
At present, a large amount of data occupies space on the Web, driven by the spread of the Internet and digital devices. The number of image libraries is growing rapidly, creating a need for effective and efficient tools to query these large databases. It has therefore become necessary for retrieval search engines to retrieve relevant documents and images from large databases. This paper attempts to provide an extensive review of image retrieval. Recent studies covering different aspects of research in this area are included. Various image retrieval techniques are discussed on the basis of existing technologies and the demands of real-world applications. The article gives an overview of the most popular image retrieval techniques, with their advantages and disadvantages.

Keywords: Retrieval, image, image database, query, annotation.

Introduction
An image represents a real object or scene. An immense number of digital images, multimedia data files, and visual objects are created and used every day, owing to the availability of digital cameras and the Internet, in areas including remote sensing, fashion, engineering, science, history, advertising, crime prevention, medicine, architecture etc.1-3. According to a recent study, there are more than 180 million images on the Web.
The image data amounts to about 3 TB (terabytes), and a staggering one million or more images are produced every day. Because the Internet is driving such rapid growth in technology, large networked databases must be managed for retrieval. Image retrieval systems incorporate browsing, searching, and retrieving images from a large collection of image databases. Search engines are the most powerful resources for finding visual content on the World Wide Web. These search engines use the text surrounding an image to describe its content and rely on text retrieval techniques to search for a particular image. In an image retrieval process, the user submits a query as an image, as text in the form of keyword(s), or as an image link, and the retrieval system then searches for and retrieves the images "similar" to the query. Image retrieval is also largely constrained by other factors, such as the diversity of the user base and the retrieval time. In addition, the data searched can be divided into archives, domain-specific collections, enterprise collections, personal collections etc.

Image retrieval has been an exceedingly active research area over the last 30 years. Review articles from various years have discussed the state of the art of image retrieval in those years and described the technologies implemented. Enser et al.10 reported a broad description of image databases, various indexing methods, and common browsing and searching tasks, using primarily text-based searches on annotated images. This review paper presents a systematic overview of the image retrieval techniques used up to now for effective retrieval from large image databases.

Basic Idea of Image Retrieval
The general targets of image retrieval systems are: i. The system must be able to process language queries, ii. The search must be performed across the whole image database and take human visual perception into account, iii. The system must take account of all the features of an image.
In image retrieval systems, images can be indexed automatically by summarizing their visual features. A feature is a characteristic that captures a certain visual property of an image, either globally for the entire image or locally for a region or object. Color, texture, and shape are the features commonly used in retrieval systems. Mapping the image pixels into the feature space is known as feature extraction. Extracted features are used to represent images for searching, indexing, and browsing in an image database11.

Approaches for Image Retrieval
The most traditional and common methods of image retrieval add metadata such as captions, keywords, or descriptions to the images so that retrieval can be performed over the annotation words. Manual image annotation is time-consuming, laborious, and expensive. To address this, a large amount of research has been done on automatic image annotation. Researchers have been working on image retrieval processes for many years. The three methods or systems used for image retrieval are: i. Text-based image retrieval, ii. Content-based image retrieval, iii. Hybrid approaches12.

Text-Based Image Retrieval [TBIR]: TBIR is currently used in all general-purpose web image retrieval systems. As shown in figure-1, this approach uses the text associated with an image to determine what the image contains. This text can be the text surrounding the image, the image's filename, a hyperlink leading to the image, an annotation to the image, or any other piece of text that can be associated with the image13. Search engines such as Google, Yahoo, and Bing are examples of retrieval systems using TBIR.
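The text-based approach described above can be illustrated with a minimal sketch. It assumes each image comes with a single string of associated text (filename plus surrounding or annotation text) and builds a simple inverted index over it; the image records, the function names, and the ranking by number of matched query terms are all hypothetical choices for illustration, not the method of any particular search engine.

```python
import re
from collections import defaultdict

# Hypothetical records: the text associated with each image
# (filename, annotation, and surrounding page text joined together).
IMAGES = {
    "img1": "red-rose.jpg A red rose in a garden",
    "img2": "tiger.png Bengal tiger resting near a river",
    "img3": "garden-path.jpg Stone path through a rose garden",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(images):
    """Map each term to the set of image ids whose text contains it."""
    index = defaultdict(set)
    for image_id, text in images.items():
        for term in tokenize(text):
            index[term].add(image_id)
    return index


def search(index, query):
    """Rank images by how many distinct query terms match their text."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for image_id in index.get(term, ()):
            scores[image_id] += 1
    # Highest score first; ties are broken by image id for determinism.
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))


index = build_index(IMAGES)
print(search(index, "rose garden"))  # ['img1', 'img3']
```

Because matching is purely lexical, a query term with several meanings retrieves every image whose text contains it, which is the polysemy weakness commonly attributed to TBIR.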
Search engines of this kind have indexed over one billion images14. Keyword-based indexing has many advantages, including the ability to represent both general and specific instantiations of an object at varying levels of complexity15. In the past, access to image collections was provided by librarians and archivists through text descriptions or classification codes that could be digitized. Several attempts have been made to provide a general system for image indexing. These include the Getty's Art and Architecture Thesaurus (AAT), which comprises more than 120,000 terms for the description of art, architecture, and other ethnic objects, and the Library of Congress Thesaurus for Graphic Materials (LCTGM). The AAT currently provides access to a number of hierarchical categories of image description using seven broad facets (Associated Concepts, Physical Attributes, Styles and Periods, Agents, Activities, Materials, and Objects).

Textual representation of images is problematic because an image conveys information both about what is actually pictured in it and about what the image is about. Shatford16 framed this discussion with a framework based on Panofsky's approach to analyzing the iconographical levels of meaning in images. Shatford-Layne17 extended the discussion by providing a theoretical model for analyzing the subject of an image, and suggested that deciding which attributes are relevant, and therefore produce useful groupings of images, might need to be left to the users. Turner et al.18 extended this model by analyzing the terms assigned by groups to both still and moving images, with the goal of finding appropriate ways to index images.

Manual assignment of textual attributes is a major issue for TBIR because it is both time-consuming and costly. Manual indexing suffers from a mismatch between indexers' terms and user queries10,19 and from low term agreement across indexers20.
Textual attributes have also been assigned automatically using the verbal descriptions for the blind that accompany many videos18. Such attributes may be represented more usefully by image exemplars and retrieved by systems that perform pattern matching on color, texture, shape, and other visual features. The main advantages and disadvantages of TBIR are as follows21.

Advantages: i. Easier implementation, ii. Fast retrieval (user friendly), iii. Easy web image search (using surrounding text).

Disadvantages: i. Manual description is impossible for a huge database, ii. Manual description of an image is not accurate, iii. The surrounding text may not be relevant to the image to be retrieved, iv. Polysemy problem (one keyword can have several meanings).

Figure-1
Text-Based Image Retrieval

Content-Based Image Retrieval: CBIR is a technique for retrieving images by extracting and indexing automatically derived low-level image features such as color, texture, and shape22. CBIR is also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR)23. CBIR, which uses visual content to search large-scale image databases according to the user's interest, has been an active and fast-advancing research area since the 1990s. In a typical CBIR system, the visual content of the images in the database is extracted and described by multi-dimensional feature vectors24. The color content of an image is the most widely used feature for CBIR, while texture and shape features are used to a lesser degree. A single feature is not enough to discriminate among a homogeneous group of images. In such cases, either pairs of these features or all of them are used for indexing and retrieval.
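As an illustration of the feature-vector representation described above, the following minimal sketch computes a normalized color histogram for an image and compares two histograms. It rests on several assumptions not taken from the source: an image is given as a plain list of (r, g, b) tuples, each channel is quantized into 4 bins, and histogram intersection is used as the comparison measure (one common choice among many). All function and variable names are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized RGB histogram as a flat feature vector.

    pixels: iterable of (r, g, b) tuples with channel values in 0..255.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Quantize each channel into bins_per_channel levels (0..bins-1).
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    # Normalizing makes images of different sizes comparable.
    return [value / count for value in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_color(query_pixels, database):
    """Return database image names ordered from most to least similar."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Tiny synthetic "images" used only to exercise the functions.
QUERY = [(250, 10, 10)] * 9 + [(10, 10, 250)]
DATABASE = {
    "sunset": [(240, 20, 20)] * 8 + [(20, 20, 240)] * 2,
    "ocean": [(10, 10, 250)] * 10,
    "forest": [(10, 200, 10)] * 10,
}
print(rank_by_color(QUERY, DATABASE))  # ['sunset', 'ocean', 'forest']
```

A histogram of this kind ignores where colors occur in the image, which is one reason a single feature may fail to separate similar-looking images; texture or shape descriptors could be concatenated to the same vector under the same scheme.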
Similarity matching, through metrics called similarity measures, determines the degree of relevance of an image in a collection to a query. This is the key component of a CBIR system, because its primary goal is to find a set of images similar to the image the user has in mind25. A general and simplified model of a query-by-example (QBE) CBIR system is shown in figure-2. IBM's Query by Image Content (QBIC), first described by Flickner et al.26, Virage's VIR Image engine27, and Excalibur's Image RetrievalWare are several CBIR systems in commercial use. For retrieving images on the Web, CBIR systems such as WebSEEK28, Informedia, and Photobook are preferred, among others15. Idris and Panchanathan29 discussed several methods for image indexing and content-based image retrieval.

The advantages of CBIR are as follows: i. The features employed by the systems, including color, texture, shape, and spatial features, are indexed automatically, ii. Similarity between images is based on the features of those images, iii. Semantic retrieval.

Figure-2
A general model of CBIR system25

The following is a brief description of the dominant methods of CBIR:

Color: Image retrieval based on color similarity is achieved by computing, for each image, a color histogram that identifies the proportion of pixels within the image holding specific values. Many attempts have been made to segment color proportions by region and by the spatial relationships among several color regions30-31.

Texture: Texture contains important information about the structural arrangement of surfaces and their relationship to the surrounding environment. Texture is a difficult concept to represent.
The identification of specific textures in an image is achieved primarily by modelling texture as a two-dimensional gray-level variation. The relative brightness of pairs of pixels is computed so that the degree of contrast, regularity, and directionality can be estimated. Ma and Manjanath3,32 have extended work in this area through the development of a texture thesaurus that matches texture regions in images to words representing texture attributes.

Shape: Shape does not refer to the shape of an image but to the shape of a particular region that is being sought. Queries for shapes are generally made by selecting an example image provided by the system or by having the user sketch a shape. The primary mechanisms used for shape retrieval include the identification of features such as lines, boundaries, aspect ratio, and circularity, and the identification of areas of change or stability via region growing and edge detection. Research in object recognition conducted by Forsythe et al.33 has sought to develop techniques for modelling a class of objects and for identifying and defining the attributes and features of that class. Chang et al.34 also use users' relevance judgments to refine searches and to assign semantic keywords to an image, which subsequent users can use to query the system. The technology for CBIR is still in its infancy.

Hybrid Approach: A recent trend in image search is to fuse the two basic modalities of a web image, i.e., its textual context (usually represented by keywords) and its visual features, for retrieval35. It has been suggested that the joint use of textual context and visual features can provide better retrieval results36. The simplest approach counts the frequency of occurrence of words for automatic indexing. This simple approach can be extended by giving more weight to words that occur in the alt or src attribute of the image tag, inside the head tag, or in any other important tag of the HTML document.
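The weighted word-frequency idea above can be sketched with Python's standard html.parser module. The weights below are hypothetical (the text says only that words in certain tags should count for more), the title element is used as a stand-in for "inside the head tag", and the src value is reduced to its filename stem before tokenizing; none of these choices come from the source.

```python
import posixpath
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights chosen for illustration only.
WEIGHTS = {"alt": 4.0, "src": 3.0, "title": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


class WeightedTermCounter(HTMLParser):
    """Accumulate a weighted frequency for every term in an HTML page."""

    def __init__(self):
        super().__init__()
        self.scores = Counter()
        self._in_title = False

    def _add(self, text, field):
        for term in tokenize(text):
            self.scores[term] += WEIGHTS[field]

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            attributes = dict(attrs)
            self._add(attributes.get("alt") or "", "alt")
            # Keep only the filename stem, e.g. "eiffel-tower".
            src = attributes.get("src") or ""
            stem = posixpath.splitext(posixpath.basename(src))[0]
            self._add(stem, "src")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        self._add(data, "title" if self._in_title else "body")


def index_terms(html, top_n=3):
    """Return the top_n (term, weighted frequency) pairs for a page."""
    parser = WeightedTermCounter()
    parser.feed(html)
    parser.close()
    ranked = sorted(parser.scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


PAGE = """<html><head><title>Paris trip</title></head>
<body><p>We visited the tower at night.</p>
<img src="photos/eiffel-tower.jpg" alt="Eiffel tower">
</body></html>"""

print(index_terms(PAGE))  # [('tower', 8.0), ('eiffel', 7.0), ('paris', 2.0)]
```

In this example "tower" scores highest because it appears in the body text, the alt text, and the filename. A practical system would also need stop-word removal and some way to associate surrounding text with a specific image on pages that contain several.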
The second approach takes a different stand and treats images and text as equivalent data. It attempts to discover the correlation between visual features and textual words on an unsupervised basis, by estimating the joint distribution of features and words and posing annotation as statistical inference in a graphical model. As a result, a plain combination of the TBIR and CBIR approaches is not efficient for dealing with the problem of image retrieval on the Web.

Conclusion
This review article presents a brief survey of image retrieval techniques. Many researchers have focused on image retrieval techniques, and each research work has its own way of retrieving the relevant images, its own contributions to retrieval systems in general, and its own limitations. The article attempts to cover briefly the most common modern and commercial image retrieval systems and techniques, from early text-based systems to content-based retrieval. From the study of past and present image retrieval systems, it can be concluded that researchers have done satisfactory work, but there is still a long way to go to overcome the flaws of image retrieval systems.

References
1. Pal M.S. and Garg S.K., Image Retrieval: A Literature Review, IJARCET, 2(6), (2013)
2. Rosenfeld A., Picture Processing by Computer, ACM Computing Surveys (CSUR), 1(3), 147-176, (1969)
3. Tamura H. and Mori S., A Data Management System for Manipulating Large Images, In Proceeding of Workshop on Picture Data Description and Management, 45-54, (1977)
4. Lawrence S. and Giles L., Accessibility of information on the web, Nature, 400, 107-109, (1999)
5. Jain R., Workshop Report: NSF workshop on visual information management systems, Storage and Retrieval for Image and Video Databases (Niblack, W. R., and Jain, R.
C., eds), Proceeding SPIE 1908, 198-218, (1993)
6. Jorgensen C., Image Retrieval Theory and Research, (2003)
7. Riad A.M., Atwan A., and Abd El-Ghany S., Image Based Information Retrieval Using Mobile Agent, Egyptian Informatics Journal, 10(1), (2009)
8. Vijayarajan V., Khalid M. and Chandramouli P.V.S.S.R., A Review: From Keywords Based image Retrieval to Technology Based Image Retrieval, International Journal of Review in Computing, 12, (2012)
9. Chang S.K. and Kunii T., Pictorial data-base applications, IEEE Comput., 14(11), 13-21, (1981)
10. Enser P., Pictorial information retrieval, Journal of Documentation, 51(2), 126-170, (1995)
11. Gavade J.D., Chhajed G.J. and Upadhyay K.A., Review on Image Retrieval System, International Journal of Advance Research in Electrical, Electronics, and Instrumentation Engineering, 2(4), April, (2013)
12. Alaa M. Riad, Hamdy K. Elminir and Sameh Abd-Elghany, A Literature Review of Image Retrieval based on Semantic Concept, International Journal of Computer Applications (0975-8887), 40(11), (2012)
13. Su J., Wang B., Yeh H. and Tseng V.S., Ontology-Based Semantic Web Image Retrieval by Utilizing Textual and Visual Annotations, Web Intelligence/IAT Workshops, 425-428, (2009)
14. Popescu A., Grefenstette G. and Moëllic P., Improving Image Retrieval Using Semantic Resources, Advances in Semantic Media Adaptation and Personalization, 75-96, (2008)
15. Goodrum A. and Spink A., Visual Information Seeking: A Study of Image Queries on the World Wide Web, Proceedings of the 1999 Annual Meeting of the American Society for Information Science, October 31-Nov 4, (1999), Washington, DC.
16. Shatford S., Analyzing the subject of a picture: a theoretical approach, Cataloging and Classification Quarterly, 6(3), 39-62, (1986)
17. Shatford-Layne S., Some Issues in the Indexing of Images, Journal of the American Society of Information Science, 45(8), 583-588, (1994)
18. Turner J., Representing and Accessing Information in the Stockshot Database at the National Film Board of Canada, The Canadian Journal of Information Science, 15, 1-22, (1990)
19. Seloff G.A., Automated Access to the NASA-JSC Image Archive, Library Trends, 38(4), 682-696, (1990)
20. Markey K., Access to iconographical research collections, Library Trends, 37(2), 154-174, (1988)
21. Ahmed G.F. and Barskar R., A Study on Different Image Retrieval Techniques in Image Processing, IJSCE, ISSN: 2231-2307, 1(4), (2011)
22. Smeulders A.W.M., Worring M., Santini S., Gupta A. and Jain R., Content-Based Image Retrieval at the End of the Early Years, IEEE Trans. Pattern Anal. Machine Intel., 22(12), 1349-1380, (2000)
23. Guoyong D., Yang J. and Yong Y., Content-Based Image Retrieval Research, International Conference on Physics, Science and Technology (ICPST), (2011)
24. MacArthur S.D., Brodly C.E. and Kak A.C., Interactive CBIR Using Relevance Feedback, Computer Vision and Image Understanding, 55-75, (2002)
25. Singh B. and Ahmed W., Content-Based Image Retrieval: A Review Paper, IJCSMC, 3(5), 769-775, (2014)
26. Flickner M., Sawhney H., Niblack W., Ashley J., Huang Q., Dom B., Gorkani M., Hafner J., Lee D., Petkovic D., Steele D. and Yanker P., Query by Image and Video Content: The QBIC System, IEEE Comput., 28(9), 23-32, (1995)
27. Gupta A., The Virage Image Search Engine: An Open Framework for Image Management, in Storage and Retrieval for Image and Video Databases IV, Proceeding SPIE 2670, 76-87, (1996)
28. Smith J.R.
and Chang S.F., VisualSEEk: A Fully Automated Content-Based Image Query System, In Proceedings of the fourth ACM International Conference on Multimedia, 87-98, (1997)
29. Idris F. and Panchanathan S., Review of Image and Video Indexing Techniques, Journal of Visual Communication and Image Representation, 8(2), 146-166, (1997a)
30. Stricker M. and Orengo M., Similarity of Color Images, in Storage and Retrieval for Image and Video Databases III (Niblack, W.R., and Jain, R. C., eds), Proceeding SPIE 2420, 381-392, (1995)
31. Carson C., Belongie S., Greenspan H. and Malik J., Region-Based Image Querying, in Proceedings of the 1997 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'97), IEEE Computer Society, San Juan, Puerto Rico, 42-51, (1997)
32. Ma W. and Manjanath B., Netra: A Toolbox for Navigating Large Image Databases, Proceedings of IEEE International Conference on Image Processing (ICIP97), 568-571, (1998)
33. Forsythe D., Finding Pictures of Objects in Large Collections of Images, in Digital Image Access and Retrieval, 1996 Clinic on Library Applications of Data Processing, (Heidorn, P. and Sandore, B. eds.), 118-139, (1997)
34. Chang E., RIME: A Replicated Image Detector for the WWW, in Multimedia Storage and Archiving Systems III, (Kuo, C. et al, eds.), Proceeding SPIE 3527, 58-67, (1998)
35. He R., Xiong N., Yang L.T. and Park J.H., Using Multi-Model Semantic Association Rules to Fuse Keywords and Visual Features Automatically for Web Image Retrieval, Information Fusion, 12(3), (2010)
36. Hou J., Zhang D., Chen Z., Jiang L., Zhang H. and Qin X., Web Image Search by Automatic Image Annotation and Translation, Presented at the 17th International Conference on Systems, Signals, and Image Processing, (2010)