KALIMAT: a Multipurpose Arabic Corpus

We are pleased to announce the immediate availability of KALIMAT 1.0.

KALIMAT is an Arabic natural language resource that consists of:
1) 20,291 Arabic articles collected from the Omani newspaper Alwatan by Abbas et al. (2011).
2) 20,291 extractive single-document system summaries.
3) 2,057 extractive multi-document system summaries.
4) 20,291 named-entity-recognised articles.
5) 20,291 part-of-speech-tagged articles.
6) 20,291 morphologically analysed articles.

The articles in the data collection fall into six categories:
culture, economy, local-news, international-news, religion, and sports.

The process of creating KALIMAT was applied to the entire data collection (20,291 articles).
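
As a quick sanity check after downloading the corpus, the article counts can be tallied per category. The sketch below is illustrative only: it assumes a hypothetical layout in which each category has its own folder of `.txt` files under a common root, which this announcement does not confirm, and the function name `count_by_category` is made up for the example. Adjust the paths and file extension to match the actual archive.

```python
from collections import Counter
from pathlib import Path

# The six categories named in the announcement.
CATEGORIES = (
    "culture",
    "economy",
    "local-news",
    "international-news",
    "religion",
    "sports",
)


def count_by_category(root):
    """Count .txt files in root/<category>/ for each category.

    ASSUMPTION: one folder per category containing plain-text articles.
    This layout is hypothetical and may differ from the real archive.
    """
    root = Path(root)
    counts = Counter()
    for category in CATEGORIES:
        folder = root / category
        if folder.is_dir():
            counts[category] = sum(1 for p in folder.glob("*.txt") if p.is_file())
        else:
            # A missing folder is reported as zero rather than raising an error.
            counts[category] = 0
    return counts
```

If the layout assumption holds, `sum(count_by_category(path).values())` should match the 20,291 articles stated above; a different total suggests the archive is organised differently or is incomplete.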

Features

  • corpus
  • natural language processing
  • resources
  • Arabic NLP
  • NLP
  • Arabic
  • Morphological Analyser
  • Named Entity Recognition
  • Part of Speech Tagger
  • Summarization
  • Morphosyntactic Analyser

Additional Project Details

Registered: 2013-02-19