
	<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.0//EN" "http://www.ncbi.nlm.nih.gov:80/entrez/query/static/PubMed.dtd">
	<ArticleSet>

	<Article> 

	<Journal> 

	<PublisherName>International Science Community Association</PublisherName>

	<JournalTitle>Research Journal of Recent Sciences</JournalTitle> 

	<Issn>2277-2502</Issn>

	<Volume>3</Volume>

	<Issue>4</Issue>

	<PubDate PubStatus="ppublish"> 

	<Year>2014</Year> 

	<Month>April</Month> 

	<Day>2</Day> 

	</PubDate>

	</Journal>



	<ArticleTitle>The Analysis of Connected Components and Clustering in Segmentation of Persian Texts</ArticleTitle> 


	<FirstPage>71</FirstPage>

	<LastPage>77</LastPage>



	<ELocationID EIdType="pii"></ELocationID>

	<Language>EN</Language> 
	<AuthorList>

	
		<Author> 

		<FirstName>ZolfaghariZaferani</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Rashid</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Faculty of Education And Consulting, Roudehen Branch, Islamic Azad University, Roudehen, IRAN</Affiliation>

		</Author>
		<Author> 

		<FirstName>Haji</FirstName>

		<MiddleName> </MiddleName>

		<LastName>AliakbariNeda</LastName>

		<Suffix>2</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>HosseinRezaeiDolat</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Abadi</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Islamic Azad University, Persian Gulf International Educational Branch, Khuzestan, IRAN</Affiliation>

		</Author>
		<Author> 

		<FirstName>AliReza</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Eghbali</LastName>

		<Suffix>2</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>HassanHeidarySoltan</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Abadi</LastName>

		<Suffix>3</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>EspananiHamid</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Reza</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Department of Biology, Faculty of Sciences, Payam Noor University of Iran, Employee social security, Isfahan, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Shirani</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Kobra</LastName>

		<Suffix>2</Suffix>

		<Affiliation> Department of Pharmacodynamy and Toxicology, School of Pharmacy, Mashhad University of Medical Sciences, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Khalilian</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Saadat</LastName>

		<Suffix>3</Suffix>

		<Affiliation> Department of Biochemistry, Faculty of Biological Science, Tarbiat Modares University, Tehran, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Sadeghi</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Leila</LastName>

		<Suffix>4</Suffix>

		<Affiliation> Department of Biology, Faculty of Sciences, Payam Noor University of Iran, Isfahan, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>YousefiBabadiVahid</FirstName>

		<MiddleName> </MiddleName>

		<LastName></LastName>

		<Suffix>5</Suffix>

		<Affiliation> Physiology Research Center, Isfahan Cardiovascular Research Center, Isfahan Cardiovascular Research Institute, Isfahan University of Medical</Affiliation>

		</Author>
		<Author> 

		<FirstName></FirstName>

		<MiddleName> </MiddleName>

		<LastName>AmraeaiEsmaiel</LastName>

		<Suffix>6</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Bansal</FirstName>

		<MiddleName> </MiddleName>

		<LastName>A.</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Department of ECE, G.L.A. University, Mathura, UP, INDIA </Affiliation>

		</Author>
		<Author> 

		<FirstName>Agarwal</FirstName>

		<MiddleName> </MiddleName>

		<LastName>R.</LastName>

		<Suffix>2</Suffix>

		<Affiliation> Department of EIED, Thapar University, Patiala, Punjab, INDIA </Affiliation>

		</Author>
		<Author> 

		<FirstName>Sharma</FirstName>

		<MiddleName> </MiddleName>

		<LastName>R.K.</LastName>

		<Suffix>3</Suffix>

		<Affiliation> School of Mathematics and Computer Applications, Thapar University, Patiala, Punjab, INDIA</Affiliation>

		</Author>
		<Author> 

		<FirstName>Paria</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Mohammadiha</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Dept. of Educational Administration, Faculty of Psychology and Social Science, Central Tehran Branch, Islamic Azad University, Tehran, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Mahdi</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Shariatmadari</LastName>

		<Suffix>2</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Fereshteh</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Kordestani</LastName>

		<Suffix>3</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Bakht</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Azam</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Department of Computer Science, Islamia College Peshawar, Peshawar, K.P, PAKISTAN  </Affiliation>

		</Author>
		<Author> 

		<FirstName>Qureshi</FirstName>

		<MiddleName> </MiddleName>

		<LastName>RashidJalal</LastName>

		<Suffix>2</Suffix>

		<Affiliation> Faculty of Computing, SZABIST, Dubai International Academic City, Dubai, U.A.E  </Affiliation>

		</Author>
		<Author> 

		<FirstName>Zahoor</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Jan</LastName>

		<Suffix>3</Suffix>

		<Affiliation> Department of Pathology, Lady Reading Hospital, Peshawar, K.P, PAKISTAN</Affiliation>

		</Author>
		<Author> 

		<FirstName>TajAli</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Khattak</LastName>

		<Suffix>4</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Jafar</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Omari</LastName>

		<Suffix>1</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Fazlzadeh</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Alireza</LastName>

		<Suffix>2</Suffix>

		<Affiliation>Department of Accounting, Tabriz Branch, Islamic Azad University, Tabriz, IRAN</Affiliation>

		</Author>
		<Author> 

		<FirstName>MohammadrezaM.</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Nahidi</LastName>

		<Suffix>3</Suffix>

		<Affiliation>Economics and Management Faculty, Tabriz University, Tabriz, IRAN</Affiliation>

		</Author>
		<Author> 

		<FirstName>Subahana</FirstName>

		<MiddleName> </MiddleName>

		<LastName>K.R.</LastName>

		<Suffix>1</Suffix>

		<Affiliation> CO Research and Green Technologies Centre, VIT University, Vellore, Tamil Nadu, INDIA </Affiliation>

		</Author>
		<Author> 

		<FirstName>Natarajan</FirstName>

		<MiddleName> </MiddleName>

		<LastName>R.</LastName>

		<Suffix>2</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Suman</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Bareth</LastName>

		<Suffix>3</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Ayub</FirstName>

		<MiddleName> </MiddleName>

		<LastName>G.</LastName>

		<Suffix>1</Suffix>

		<Affiliation>University of Swat, Khyber Pakhtonkhawa, PAKISTAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Rehman</FirstName>

		<MiddleName> </MiddleName>

		<LastName>N.U.</LastName>

		<Suffix>2</Suffix>

		<Affiliation> Economics Department, University of Peshawar, Khyber Pakhtonkhawa, PAKISTAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Iqbal</FirstName>

		<MiddleName> </MiddleName>

		<LastName>M.</LastName>

		<Suffix>3</Suffix>

		<Affiliation> Statistics Department, University of Peshawar, Khyber Pakhtonkhawa, PAKISTAN</Affiliation>

		</Author>
		<Author> 

		<FirstName>Zaman</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Q.</LastName>

		<Suffix>4</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Atif</FirstName>

		<MiddleName> </MiddleName>

		<LastName>M.</LastName>

		<Suffix>5</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Waqas</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Haider</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Computer Science Department COMSATS Institute of Information Technology Wah Cantt, PAKISTAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Hadia</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Bashir</LastName>

		<Suffix>2</Suffix>

		<Affiliation> Department of Computer Science University of Engineering and Technology Taxila, PAKISTAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Abida</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Sharif</LastName>

		<Suffix>3</Suffix>

		<Affiliation> Department of Electrical Engineering, the University of Lahore – Islamabad Campus, PAKISTAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Sharif</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Irfan</LastName>

		<Suffix>4</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Abdul</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Wahab</LastName>

		<Suffix>5</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Masoumifard</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Mohammad</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Department of Mechanical Engineering, Meshkinshahr Branch, Islamic Azad University, Meshkinshahr, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName></FirstName>

		<MiddleName> </MiddleName>

		<LastName>Norouzi</LastName>

		<Suffix>2</Suffix>

		<Affiliation> Young Researchers and Elite Club, Meshkinshahr Branch, Islamic Azad University, Meshkinshahr, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName></FirstName>

		<MiddleName> </MiddleName>

		<LastName>Rouhollah</LastName>

		<Suffix>3</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Nezamkhiavy</FirstName>

		<MiddleName> </MiddleName>

		<LastName>Khosro</LastName>

		<Suffix>4</Suffix>

		<Affiliation></Affiliation>

		</Author>
		<Author> 

		<FirstName>Askarpour</FirstName>

		<MiddleName> </MiddleName>

		<LastName>S.</LastName>

		<Suffix>1</Suffix>

		<Affiliation>Faculty Member of Technical and Vocation University, Kerman, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>SaberiAnari</FirstName>

		<MiddleName> </MiddleName>

		<LastName>M.</LastName>

		<Suffix>2</Suffix>

		<Affiliation> Faculty Member of Technical and Vocation University, Yazd, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Brumandnia</FirstName>

		<MiddleName> </MiddleName>

		<LastName>A.</LastName>

		<Suffix>3</Suffix>

		<Affiliation> Faculty member of Azad University, South Tehran Branch, Tehran, IRAN </Affiliation>

		</Author>
		<Author> 

		<FirstName>Javidi</FirstName>

		<MiddleName> </MiddleName>

		<LastName>M.M.</LastName>

		<Suffix>4</Suffix>

		<Affiliation> Faculty member of Shahid Bahonar University, Kerman, IRAN </Affiliation>

		</Author>

	<Author>

	<CollectiveName></CollectiveName>

	</Author>

	</AuthorList>


	<PublicationType>Research Article</PublicationType>


	<History>  
	<PubDate PubStatus="received">
	<Year>2013</Year>
	<Month>12</Month>
	<Day>8</Day>
	</PubDate>
	<PubDate PubStatus="accepted">										
	<Year>2014</Year> 
	<Month>April</Month>									
	<Day>2</Day> 
	</PubDate>

	</History>
	<Abstract>With the growing use of computers in daily life and the increasing adoption of structured electronic documents, the need to convert paper documents into electronic form, and with it the use of image processing, has grown. Word recognition in text has been studied extensively for languages such as English, Japanese, and Chinese. In Persian and Arabic, however, further research is still needed because of complexities such as connected letters and letter forms that vary with their position in the word. Segmentation is one of the most important steps in a character recognition system, and its accuracy and speed are critical. Segmentation of Persian text is especially hard because of the characteristics of the language. In this study, we present an algorithm for segmenting Persian documents that aims to be faster and more efficient than comparable algorithms; it uses connected components and clustering to identify and group text and image regions. The method is intended for general users and can serve as a preprocessing step for Optical Character Recognition systems. It was evaluated on a collection of 100 scanned images of Persian newspapers and magazines at 300 dpi resolution; the simulation results show an accuracy of 92.3% and a significant speed advantage over other approaches such as the Voronoi diagram.</Abstract>

	<CopyrightInformation>Copyright © International Science Community Association</CopyrightInformation>

	<ObjectList> 
	<Object Type="keyword">
	<Param Name="value"></Param>
	</Object>

	</ObjectList>	

	</Article>

	</ArticleSet>
	