Scopus Export 2000s

Turkish Word N-Gram Analyzing Algorithms For A Large Scale Turkish Corpus - TurCo

Abstract

Calculating the statistical properties of a language first requires a sample of that language; such a sample is called a corpus. An unbalanced, large-scale Turkish text corpus (TurCo), about 362 MB in size and containing more than 50 million words, was prepared from 12 different resources, including web sites and novels in Turkish. Different algorithms were tested to obtain the n-gram (1 ≤ n ≤ 5) values. The efficiency of the algorithms was examined by applying them to each piece of the corpus one by one. Detailed results are given only for the two algorithms that do not use database tables, because all the other algorithms required more than one day to run, which made testing them impractical.
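
The abstract does not describe the tested algorithms themselves, so the following is only a minimal sketch of what counting word n-grams for 1 ≤ n ≤ 5 in memory (that is, without database tables) might look like. The function name `count_ngrams`, the whitespace tokenization, and the sample sentence are illustrative assumptions, not taken from the paper; no case folding or punctuation handling is attempted.

```python
from collections import Counter


def count_ngrams(tokens, max_n=5):
    """Count word n-grams for every n from 1 to max_n.

    Hypothetical helper: returns a dict mapping n to a Counter whose
    keys are tuples of n consecutive tokens.
    """
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[n][tuple(tokens[i:i + n])] += 1
    return counts


# Illustrative sample text (an assumption, not part of TurCo).
sample = "bir varmış bir yokmuş bir varmış"
tokens = sample.split()  # naive whitespace tokenization
counts = count_ngrams(tokens)

# Show the most frequent unigram and bigram.
print(counts[1].most_common(1))
print(counts[2].most_common(1))
```

Holding every counter in memory is workable for a small sample like this one; for a corpus of more than 50 million words, the number of distinct 4-grams and 5-grams can grow large, which is one reason the choice of counting algorithm matters for run time.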

Publication Date

1-1-2004

Publication Title

International Conference on Information Technology: Coding Computing, ITCC

Volume

Number of Pages

236-240

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/itcc.2004.1286638

Copyright Status

Unknown

Scopus ID

3042648806 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/3042648806

STARS Citation

Çebi, Yalçin and Dalkiliç, Gökhan, "Turkish Word N-Gram Analyzing Algorithms For A Large Scale Turkish Corpus - Turco" (2004). Scopus Export 2000s. 5709.
https://stars.library.ucf.edu/scopus2000/5709
