Page 2 | Best Open Source Java Linguistics Software

Java Linguistics Software

Browse free open source Java Linguistics Software and projects below.

1

BANAL

BANAL - Banal And Not A Language. A prototyping notation compatible with Java and C# (via the largest possible common footprint between the two).

Downloads: 0 This Week

Last Update: 2013-04-12
See Project
2

BANNER Named Entity Recognition System

BANNER is a named entity recognition system intended primarily for biomedical text. It uses conditional random fields as the primary recognition engine and includes a wide survey of the best techniques described in recent literature.

Downloads: 0 This Week

Last Update: 2015-07-30
See Project
3

Bermuda Text-to-Speech

This project includes basic NLP and DSP techniques for Text-to-Speech

See the TTS demo at: http://rslp.racai.ro/index.php?page=tts
This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. To read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.

Downloads: 0 This Week

Last Update: 2014-03-24
See Project
4

BioContext

Software for extraction of biomedical information from literature

Downloads: 0 This Week

Last Update: 2012-02-12
See Project
5

BioEvent

This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters of each event in a text. The method is based on SVM, but other ML algorithms can be adopted. The method details are explained in the following paper: Ehsan Emadzadeh, Azadeh Nikfarjam, and Graciela Gonzalez. 2011. Double Layered Learning for Biological Event Extraction from Text. In Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task, Portland, Oregon, June. Association for Computational Linguistics.

Downloads: 0 This Week

Last Update: 2013-04-25
See Project
6

BioLemmatizer

Lemmatization tool for morphological analysis of biomedical literature

The BioLemmatizer is a domain-specific lemmatization tool for the morphological analysis of biomedical literature. It is tailored to the biological domain through integration of several published lexical resources related to molecular biology. It focuses on the inflectional morphology of English, including the plural forms of nouns, the conjugations of verbs, and the comparative and superlative forms of adjectives and adverbs.

README: https://sourceforge.net/projects/biolemmatizer/files/

The BioLemmatizer 1.2 release adds optional functionality to normalize British English spellings into American English spellings and then retrieve the corresponding lemmas.

If you use the BioLemmatizer to support academic research, please cite the following paper: Haibin Liu, Tom Christiansen, William A. Baumgartner Jr., and Karin Verspoor. BioLemmatizer: a lemmatization tool for morphological processing of biomedical text. Journal of Biomedical Semantics 2012, 3:3.

Downloads: 0 This Week

Last Update: 2013-10-23
See Project
7

Board Game Language

Board Game Language (BGL, pronounced "bagel") is a programming language with natural-language syntax, aimed at first-time programmers. It uses board games as a metaphor for programming concepts, with the goal of teaching users the foundations of programming.

Downloads: 0 This Week

Last Update: 2014-06-23
See Project
8

BuckTagger

User-assisted tool for Arabic stem entry to Buckwalter Morpho Analyzer

Using rules written in a Drools decision table, BuckTagger determines the correct Buckwalter Tag based on morphological properties of the input, which are either extracted automatically or given by the user. At the moment, BuckTagger is not complete; it can only handle input that is:
- Uninflected
- In lexical form, i.e., no clitics or affixes.
- A Perfect or Imperfect Verb
- Preferably diacritized/vocalized on the first and second-to-last letters.
The interface is in Arabic. See the README for more details. There is much room for development. Feel free to comment.

Downloads: 0 This Week

Last Update: 2014-05-22
See Project
9

CHALICE

Connecting Historical Authorities with Links, Contexts and Entities. CHALICE is a historic placename gazetteer for the UK, published as Linked Data and linked to other widely-used sources of placename reference information on the semantic web.

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
10

CLEiM

Cross Lingual Education in Medicine

CLEiM (Cross Lingual Education in Medicine) is an open-source version of an intelligent system that extracts concepts from medical texts and provides qualified information. It integrates information from various sources. The system was developed by the Intelligent System Group GSI (http://www.esi.uem.es/gsi/) at UEM University. Named Entity Recognition (NER) is based on the GATE platform. Installation is simple, and the system can be used as a web application. It has been tested under apache-tomcat. The original system has been used successfully to carry out active learning activities with medical students, but it could be of interest in many other fields of knowledge.

Downloads: 0 This Week

Last Update: 2014-09-10
See Project
11

Chaski

Distributed phrase-based machine translation training tool based on Hadoop.

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
12

CoSyne Integrated Prototype

Multilingual Content Synchronization with Wikis: CoSyne is a Research and Technological Development project co-funded by the European Union. Details: http://cosyne.eu

Downloads: 0 This Week

Last Update: 2013-04-29
See Project
13

Communication Supporting System

Downloads: 0 This Week

Last Update: 2015-03-26
See Project
14

Communication Supporting System

Downloads: 0 This Week

Last Update: 2013-05-29
See Project
15

ConTextKit

ConTextKit is a Java-based implementation of Wendy Chapman's ConText algorithm for annotating the context of medical documents, specifically the negation, temporality, and experiencer.

Downloads: 0 This Week

Last Update: 2014-06-24
See Project
16

CorpSe

CORPSE (CORPus SEarch) is a powerful search engine written in Java. The aim is to provide an efficient implementation of a word level inverted index search with various cool functions that can be used on very large corpora.
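
For readers unfamiliar with the term, a word-level inverted index maps each word to the set of documents that contain it, so a query only touches the entries for its own words. The sketch below is a minimal illustration of that idea and is not taken from CorpSe; the class name `InvertedIndexSketch`, its methods, and the simple letter-based tokenizer are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy word-level inverted index (hypothetical, not CorpSe's implementation). */
final class InvertedIndexSketch {
    // word -> sorted IDs of the documents that contain it
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Splits the text on non-letter characters and records docId under each lowercased word. */
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the sorted IDs of documents that contain every query word (boolean AND). */
    List<Integer> search(String... words) {
        TreeSet<Integer> result = null;
        for (String word : words) {
            TreeSet<Integer> docs =
                    postings.getOrDefault(word.toLowerCase(Locale.ROOT), new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // copy so the index itself is never modified
            } else {
                result.retainAll(docs);         // intersect with the next posting list
            }
        }
        if (result == null) {
            return new ArrayList<>();           // no query words given
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "The cat sat on the mat");
        index.add(2, "The dog sat on the log");
        index.add(3, "A cat and a dog");
        System.out.println(index.search("cat", "sat")); // documents containing both words
    }
}
```

A production index for very large corpora would typically keep posting lists on disk and compress them; this in-memory version only shows the lookup structure.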

1 Review

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
17

Corpus Toolkit

A text management tool for linguistic purposes...

Downloads: 0 This Week

Last Update: 2017-11-23
See Project
18

Cunei Machine Translation Platform

Cunei is a data-driven machine translation system that builds dynamic, statistical models based on instances of known translations found in a corpus.

1 Review

Downloads: 0 This Week

Last Update: 2013-06-05
See Project
19

DArtikel!

Learn the articles of German words.

Learn German words that you already know at your own pace. With this system you can add the words you learned during the day and then do exercises with them. Written by: Jovanny Pablo Cruz Gómez, Computer Engineering student, IPN, ESIME Culhuacan, Mexico City.

Downloads: 0 This Week

Last Update: 2013-11-07
See Project
20

DCTFinder

Extract title and creation time from web page.

Web pages do not offer reliable metadata concerning their creation date and time. However, obtaining the document creation time is a necessary step before temporal normalization systems can be applied to web pages. DCTFinder is a system that parses a web page and extracts the title and creation date from its content. DCTFinder combines heuristic title detection, supervised learning with Conditional Random Fields (CRFs) for document date extraction, and rule-based creation time recognition. DCTFinder is released under the CeCILL free software license agreement. The system is described in the following paper (see 'Files' section): Xavier Tannier. "Extracting News Web Page Creation Time with DCTFinder". Proceedings of the 9th Language Resources and Evaluation Conference. Reykjavik, Iceland.
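
To give a feel for what a rule-based date recognizer can look like, here is a minimal sketch with a single rule that finds the first valid `yyyy-MM-dd` date in a piece of text. It is an illustration only: DCTFinder's actual rules and CRF features are not described in this listing, and the class `DateRuleSketch`, the method `firstDate`, and the choice of pattern are assumptions made for the example.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical single-rule date recognizer (not DCTFinder's implementation). */
final class DateRuleSketch {
    // The one assumed rule: a numeric date written as yyyy-MM-dd.
    private static final Pattern ISO_DATE =
            Pattern.compile("\\b(\\d{4})-(\\d{2})-(\\d{2})\\b");

    /** Returns the first match that is also a real calendar date, if any. */
    static Optional<LocalDate> firstDate(String text) {
        Matcher m = ISO_DATE.matcher(text);
        while (m.find()) {
            try {
                return Optional.of(LocalDate.of(
                        Integer.parseInt(m.group(1)),
                        Integer.parseInt(m.group(2)),
                        Integer.parseInt(m.group(3))));
            } catch (DateTimeException e) {
                // Looks like a date but is not one (e.g. month 13); keep scanning.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstDate("Published 2016-10-21 by the editorial team"));
    }
}
```

A real system would need many more patterns (month names, locale-specific orderings, relative expressions) and a way to choose between several candidate dates on the same page.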

Downloads: 0 This Week

Last Update: 2016-10-21
See Project
21

DawNLITE

DawNLITE is a Natural-Language-based Image Transmoding Engine. The software transforms an image to a video as recorded by a virtual camera panning and zooming over the image, following a natural language text description of the image.

Downloads: 0 This Week

Last Update: 2013-04-18
See Project
22

Dendrarium

A system for maintaining constituency syntax trees

Dendrarium is used to select and verify constituency syntax trees generated by the Świgra parser. The system is used at the Institute of Computer Science of the Polish Academy of Sciences (Instytut Podstaw Informatyki PAN) to build Składnica, a treebank for the Polish language.

Downloads: 0 This Week

Last Update: 2014-02-18
See Project
23

Discriminative Language Editor

Discriminative language editor based on ontologies

A text editor written in Java that detects discriminative expressions while the user is typing. When the internal ontology-based analyzer detects a potentially discriminative expression, the user is alerted by underlining of the related words in the text. A descriptive message about the issue is also shown when the cursor is placed over the flagged expression.

Downloads: 0 This Week

Last Update: 2016-10-30
See Project
24

Drug Extraction

Drug name extraction

Drug name recognition and normalisation/grounding to DrugBank ids and standard names.

The package provides 2 taggers:
1. DrugTagger - CRF-based with DrugBank presence feature (see feature set for details).
2. DrugnameGazetteer - gazetteer/dictionary-based. Dictionary created from DrugBank.ca database.

Both taggers include grounding/normalisation to DrugBank ids and standard names.

Feature set: Word, Word-1, Word+1, Word-1_Word, Word_Word+1, DrugBankPresence, POS. The DrugBankPresence feature indicates the presence of the drug name in the DrugBank.

Using CONLL-Evaluation: processed 32065 tokens with 3656 phrases; found: 3251 phrases; correct: 2786. accuracy: 95.25%; precision: 85.70%; recall: 76.20%; FB1: 80.67

Using GATE Corpus Benchmark:
Strict: P: 0.65 R: 0.73 F1: 0.69
Lenient: P: 0.74 R: 0.84 F1: 0.78

For details of how to reproduce the evaluation, see the README. To use the standalone version for tagging, download DrugExtractionStandalone.tar.gz from Files.
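
The CONLL-Evaluation precision, recall, and FB1 figures quoted above follow directly from the three phrase counts (3656 gold phrases, 3251 found, 2786 correct). The sketch below recomputes them using the standard definitions; the class and method names are made up for this example and are not part of the Drug Extraction package. The token-level accuracy (95.25%) cannot be recomputed from the phrase counts alone, so it is left out.

```java
import java.util.Locale;

/** Recomputes phrase-level precision, recall and F1 from raw counts (illustrative helper). */
final class PrfSketch {
    /** Fraction of predicted phrases that are correct. */
    static double precision(int correct, int found) {
        return found == 0 ? 0.0 : (double) correct / found;
    }

    /** Fraction of gold-standard phrases that were recovered. */
    static double recall(int correct, int gold) {
        return gold == 0 ? 0.0 : (double) correct / gold;
    }

    /** Harmonic mean of precision and recall. */
    static double f1(double p, double r) {
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        int gold = 3656, found = 3251, correct = 2786; // counts from the listing above
        double p = precision(correct, found);
        double r = recall(correct, gold);
        System.out.println(String.format(Locale.ROOT,
                "precision: %.2f%%; recall: %.2f%%; FB1: %.2f",
                100 * p, 100 * r, 100 * f1(p, r)));
        // prints "precision: 85.70%; recall: 76.20%; FB1: 80.67"
    }
}
```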

Downloads: 0 This Week

Last Update: 2015-06-12
See Project
25

Dutch sentiment analysis engine

A module for determining the sentiment of a piece of Dutch text

This application was developed by Incentro to satisfy requests by clients for a sentiment analyser for the Dutch language. It is currently in its alpha stage and we expect to have a beta release by November 2012. If you would like to help with the development or testing of this product, please contact us at +31[0]15 76 40 750 or info {at} incentro.com.

1 Review

Downloads: 0 This Week

Last Update: 2016-10-06
See Project