Automatic image annotation

Latest revision as of 19:15, 22 August 2023

Output of the DenseCap dense captioning software, analyzing a photograph of a man riding an elephant

Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata, in the form of captions or keywords, to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest in a database.

This method can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors, together with the training annotation words, is used by machine learning techniques to attempt to apply annotations to new images automatically. The first methods learned the correlations between image features and training annotations. Later techniques used machine translation to try to translate the textual vocabulary into a 'visual vocabulary' of clustered regions known as blobs. Work following these efforts has included classification approaches, relevance models, and so on.
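As a rough illustration of learning correlations between image features and training annotations, the sketch below counts how often each annotation word co-occurs with each blob and then scores words for a new image. It is a toy example only, not the method of any cited paper: the blob identifiers, the training data, and the function names `train_cooccurrence` and `annotate` are all assumptions made for illustration, and the images are assumed to have been segmented and clustered into blob ids already.

```python
from collections import Counter, defaultdict


def train_cooccurrence(training_set):
    """Count how often each annotation word appears together with each blob id."""
    counts = defaultdict(Counter)
    for blobs, words in training_set:
        for blob in blobs:
            for word in words:
                counts[blob][word] += 1
    return counts


def annotate(blobs, counts, top_k=2):
    """Rank words by the sum of estimated P(word | blob) over the image's blobs."""
    scores = Counter()
    for blob in blobs:
        word_counts = counts.get(blob)
        if not word_counts:
            continue  # blob never seen during training
        total = sum(word_counts.values())
        for word, n in word_counts.items():
            scores[word] += n / total
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


# Hypothetical training data: (blob ids found in the image, annotation words).
TRAINING_SET = [
    (["b_grey", "b_green"], ["elephant", "grass"]),
    (["b_grey", "b_blue"], ["elephant", "sky"]),
    (["b_green", "b_blue"], ["grass", "sky"]),
]

COUNTS = train_cooccurrence(TRAINING_SET)
PREDICTED = annotate(["b_grey", "b_green"], COUNTS)
print(PREDICTED)
```

With a realistic vocabulary the same scoring would run over thousands of candidate words, which is why the task resembles classification with as many classes as the vocabulary has entries.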

The advantage of automatic image annotation over content-based image retrieval (CBIR) is that users can specify queries more naturally.^[1] At present, CBIR generally requires users to search by image concepts such as color and texture, or by finding example queries. Certain image features in the example images may override the concept that the user is really focusing on. Traditional methods of image retrieval, such as those used by libraries, rely on manually annotated images, which is expensive and time-consuming, especially given the large and constantly growing image databases in existence.
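To make the point about natural queries concrete, the following minimal sketch shows keyword search over images that already carry annotations, using a simple inverted index. The file names, annotation words, and the helper names `build_index` and `search` are hypothetical and are not taken from any particular retrieval system.

```python
from collections import defaultdict


def build_index(annotations):
    """Map each annotation word to the set of image ids that carry it."""
    index = defaultdict(set)
    for image_id, words in annotations.items():
        for word in words:
            index[word.lower()].add(image_id)
    return index


def search(index, query):
    """Return the ids of images annotated with every word in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical annotations, e.g. produced by an automatic annotation step.
ANNOTATIONS = {
    "img_001.jpg": ["elephant", "man", "grass"],
    "img_002.jpg": ["elephant", "river"],
    "img_003.jpg": ["man", "bicycle"],
}

INDEX = build_index(ANNOTATIONS)
HITS = search(INDEX, "elephant man")
print(HITS)
```

The user types words rather than supplying an example image or color and texture settings, so incidental visual features of an example cannot dominate the query.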

See also

References

[1] "Archived copy" (PDF). i.yz.yamagata-u.ac.jp. Archived from the original (PDF) on 8 August 2014. Retrieved 13 January 2022.

Datta, Ritendra; Dhiraj Joshi; Jia Li; James Z. Wang (2008). "Image Retrieval: Ideas, Influences, and Trends of the New Age". ACM Computing Surveys. 40 (2): 1–60. doi:10.1145/1348246.1348248. S2CID 7060187.
Nicolas Hervé; Nozha Boujemaa (2007). "Image annotation : which approach for realistic databases ?" (PDF). ACM International Conference on Image and Video Retrieval. Archived from the original (PDF) on 2011-05-20.
M Inoue (2004). "On the need for annotation-based image retrieval" (PDF). Workshop on Information Retrieval in Context. pp. 44–46. Archived from the original (PDF) on 2014-08-08.

Further reading

Word co-occurrence model

Y Mori; H Takahashi & R Oka (1999). "Image-to-word transformation based on dividing and vector quantizing images with words.". Proceedings of the International Workshop on Multimedia Intelligent Storage and Retrieval Management. CiteSeerX 10.1.1.31.1704.

Annotation as machine translation

P Duygulu; K Barnard; N de Freitas & D Forsyth (2002). "Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary". Proceedings of the European Conference on Computer Vision. pp. 97–112. Archived from the original on 2005-03-05.

Statistical models

J Li & J Z Wang (2006). "Real-time Computerized Annotation of Pictures". Proc. ACM Multimedia. pp. 911–920.

J Z Wang & J Li (2002). "Learning-Based Linguistic Indexing of Pictures with 2-D MHMMs". Proc. ACM Multimedia. pp. 436–445.

Automatic linguistic indexing of pictures

J Li & J Z Wang (2008). "Real-time Computerized Annotation of Pictures". IEEE Transactions on Pattern Analysis and Machine Intelligence.

J Li & J Z Wang (2003). "Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach". IEEE Transactions on Pattern Analysis and Machine Intelligence. pp. 1075–1088.

Hierarchical Aspect Cluster Model

K Barnard; D A Forsyth (2001). "Learning the Semantics of Words and Pictures". Proceedings of International Conference on Computer Vision. pp. 408–415. Archived from the original on 2007-09-28.

Latent Dirichlet Allocation model

D Blei; A Ng & M Jordan (2003). "Latent Dirichlet allocation" (PDF). Journal of Machine Learning Research. pp. 3:993–1022. Archived from the original (PDF) on 2005-05-21.

Supervised multiclass labeling

G Carneiro; A B Chan; P Moreno & N Vasconcelos (2006). "Supervised Learning of Semantic Classes for Image Annotation and Retrieval" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. pp. 394–410.

Texture similarity

R W Picard & T P Minka (1995). "Vision Texture for Annotation". Multimedia Systems.

Support Vector Machines

C Cusano; G Ciocca & R Schettini (2004). Santini, Simone & Schettini, Raimondo (eds.). "Image Annotation Using SVM". Internet Imaging V. 5304: 330–338. Bibcode:2003SPIE.5304..330C. doi:10.1117/12.526746. S2CID 16246057.

Ensemble of Decision Trees and Random Subwindows

R Maree; P Geurts; J Piater & L Wehenkel (2005). "Random Subwindows for Robust Image Classification". Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. pp. 1:34–40.

Maximum Entropy

J Jeon; R Manmatha (2004). "Using Maximum Entropy for Automatic Image Annotation" (PDF). Int'l Conf on Image and Video Retrieval (CIVR 2004). pp. 24–32.

Relevance models

J Jeon; V Lavrenko & R Manmatha (2003). "Automatic image annotation and retrieval using cross-media relevance models" (PDF). Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 119–126.

Relevance models using continuous probability density functions

V Lavrenko; R Manmatha & J Jeon (2003). "A model for learning the semantics of pictures" (PDF). Proceedings of the 16th Conference on Advances in Neural Information Processing Systems NIPS.

Coherent Language Model

R Jin; J Y Chai; L Si (2004). "Effective Automatic Image Annotation via A Coherent Language Model and Active Learning" (PDF). Proceedings of MM'04.

Inference networks

D Metzler & R Manmatha (2004). "An inference network approach to image retrieval" (PDF). Proceedings of the International Conference on Image and Video Retrieval. pp. 42–50.

Multiple Bernoulli distribution

S Feng; R Manmatha & V Lavrenko (2004). "Multiple Bernoulli relevance models for image and video annotation" (PDF). IEEE Conference on Computer Vision and Pattern Recognition. pp. 1002–1009.

Multiple design alternatives

J Y Pan; H-J Yang; P Duygulu; C Faloutsos (2004). "Automatic Image Captioning" (PDF). Proceedings of the 2004 IEEE International Conference on Multimedia and Expo (ICME'04). Archived from the original (PDF) on 2004-12-09.

Image captioning

Quan Hoang Lam; Quang Duy Le; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen (2020). "UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning". Proceedings of the 2020 International Conference on Computational Collective Intelligence (ICCCI 2020). arXiv:2002.00175. doi:10.1007/978-3-030-63007-2_57.

Natural scene annotation

J Fan; Y Gao; H Luo; G Xu (2004). "Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation". Proceedings of the 27th annual international conference on Research and development in information retrieval. pp. 361–368.

Relevant low-level global filters

A Oliva & A Torralba (2001). "Modeling the shape of the scene: a holistic representation of the spatial envelope" (PDF). International Journal of Computer Vision. pp. 42:145–175.

Global image features and nonparametric density estimation

A Yavlinsky; E Schofield & S Rüger (2005). "Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation" (PDF). Int'l Conf on Image and Video Retrieval (CIVR, Singapore, Jul 2005). Archived from the original (PDF) on 2005-12-20.

Video semantics

N Vasconcelos & A Lippman (2001). "Statistical Models of Video Structure for Content Analysis and Characterization" (PDF). IEEE Transactions on Image Processing. pp. 1–17.

Ilaria Bartolini; Marco Patella & Corrado Romani (2010). "Shiatsu: Semantic-based Hierarchical Automatic Tagging of Videos by Segmentation Using Cuts". 3rd ACM International Multimedia Workshop on Automated Information Extraction in Media Production (AIEMPro10).

Image Annotation Refinement

Yohan Jin; Latifur Khan; Lei Wang & Mamoun Awad (2005). "Image annotations by combining multiple evidence & WordNet". 13th Annual ACM International Conference on Multimedia (MM 05). pp. 706–715.

Changhu Wang; Feng Jing; Lei Zhang & Hong-Jiang Zhang (2006). "Image annotation refinement using random walk with restarts". 14th Annual ACM International Conference on Multimedia (MM 06).

Changhu Wang; Feng Jing; Lei Zhang & Hong-Jiang Zhang (2007). "Content-based image annotation refinement". IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07). doi:10.1109/CVPR.2007.383221.

Ilaria Bartolini & Paolo Ciaccia (2007). "Imagination: Exploiting Link Analysis for Accurate Image Annotation". Springer Adaptive Multimedia Retrieval. doi:10.1007/978-3-540-79860-6_3.

Ilaria Bartolini & Paolo Ciaccia (2010). "Multi-dimensional Keyword-based Image Annotation and Search". 2nd ACM International Workshop on Keyword Search on Structured Data (KEYS 2010).

Automatic Image Annotation by Ensemble of Visual Descriptors

Emre Akbas & Fatos Y. Vural (2007). "Automatic Image Annotation by Ensemble of Visual Descriptors". Intl. Conf. on Computer Vision and Pattern Recognition (CVPR) 2007, Workshop on Semantic Learning Applications in Multimedia. doi:10.1109/CVPR.2007.383484.

A New Baseline for Image Annotation

Ameesh Makadia; Vladimir Pavlovic & Sanjiv Kumar (2008). "A New Baseline for Image Annotation" (PDF). European Conference on Computer Vision (ECCV).

Simultaneous Image Classification and Annotation

Chong Wang; David Blei & Li Fei-Fei (2009). "Simultaneous Image Classification and Annotation" (PDF). Conf. on Computer Vision and Pattern Recognition (CVPR).

TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation

Matthieu Guillaumin; Thomas Mensink; Jakob Verbeek & Cordelia Schmid (2009). "TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation" (PDF). Intl. Conf. on Computer Vision (ICCV).

Image Annotation Using Metric Learning in Semantic Neighbourhoods

Yashaswi Verma & C. V. Jawahar (2012). "Image Annotation Using Metric Learning in Semantic Neighbourhoods" (PDF). European Conference on Computer Vision (ECCV). Archived from the original (PDF) on 2013-05-14. Retrieved 2014-02-26.

Automatic Image Annotation Using Deep Learning Representations

Venkatesh N. Murthy; Subhransu Maji & R. Manmatha (2015). "Automatic Image Annotation Using Deep Learning Representations" (PDF). International Conference on Multimedia Retrieval (ICMR).

Holistic Image Annotation using Salient Regions and Background Image Information

Sarin, Supheakmungkol; Fahrmair, Michael; Wagner, Matthias & Kameyama, Wataru (2012). "Leveraging Features from Background and Salient Regions for Automatic Image Annotation". Journal of Information Processing. Vol. 20. pp. 250–266.

Medical Image Annotation using bayesian networks and active learning

N. B. Marvasti; E. Yörük & B. Acar (2018). "Computer-Aided Medical Image Annotation: Preliminary Results With Liver Lesions in CT". IEEE Journal of Biomedical and Health Informatics.


