On the Differences between Traditional and Web-Corpora based on the Analysis of High-Frequency Nouns
Authors
Maria Khokhlova
Corresponding Author
Maria Khokhlova
Available Online June 2017.
- DOI
- 10.2991/ipc-16.2017.76
- Keywords
- text corpus, web corpus, frequency dictionary, nouns.
- Abstract
The paper surveys text corpora and analyzes a number of Russian nouns in two web corpora: ruTenTen (18.3 billion tokens) and Araneum Russicum Maximum (13.7 billion tokens). It discusses and compares these corpora and examines the frequency properties of high-frequency Russian nouns, comparing them with the data published in the Frequency Dictionary.
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Maria Khokhlova
PY - 2017/06
DA - 2017/06
TI - On the Differences between Traditional and Web-Corpora based on the Analysis of High-Frequency Nouns
BT - Proceedings of the 45th International Philological Conference (IPC 2016)
PB - Atlantis Press
SP - 301
EP - 304
SN - 2352-5398
UR - https://doi.org/10.2991/ipc-16.2017.76
DO - 10.2991/ipc-16.2017.76
ID - Khokhlova2017/06
ER -