Open Source Text Processing Software - Page 4

Text Processing Software

  • 1
    ChordSmith

    Chordpro editor that can display, transpose and print song sheets.

    ChordSmith is a chordpro editor that can display, transpose, and print song sheets containing chords and lyrics. It can also edit song sheets (including harmonica tabs) and convert them between chordpro format (chords in square brackets in line with the lyrics) and two-line format (chords above the lyrics). Many free sources of song sheets in both formats are available on the Internet. More information at https://chordsmith.sourceforge.io/
    Downloads: 36 This Week
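    The two song sheet formats described for ChordSmith can be illustrated with a small conversion sketch. This is not ChordSmith's own code; the function name `chordpro_to_two_line` and the padding rule for overlapping chords are assumptions made for illustration.

```python
import re

# A chordpro chord is any text enclosed in square brackets, e.g. [C] or [Am7].
CHORD = re.compile(r"\[([^\]]+)\]")


def chordpro_to_two_line(line: str) -> tuple[str, str]:
    """Split one chordpro line into (chord line, lyric line).

    Hypothetical helper: each chord is placed above the lyric position
    where its bracket appeared in the chordpro source.
    """
    chords = ""
    lyrics = ""
    pos = 0
    for match in CHORD.finditer(line):
        # Copy the lyric text that precedes this chord.
        lyrics += line[pos:match.start()]
        # If the previous chord already reaches this column, keep one space
        # between the two chords instead of letting them run together.
        if chords and len(chords) >= len(lyrics):
            chords += " "
        # Pad the chord line up to the lyric column, then add the chord.
        chords = chords.ljust(len(lyrics)) + match.group(1)
        pos = match.end()
    lyrics += line[pos:]
    return chords, lyrics


if __name__ == "__main__":
    top, bottom = chordpro_to_two_line("[C]Twinkle twinkle [F]little [C]star")
    print(top)
    print(bottom)
```

    The sketch handles a single line; directives such as titles or comments, which real chordpro files also contain, are left out.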
  • 2
    GATE
    Note that the source code and issue tracker have now moved to GitHub (https://github.com/GateNLP/). GATE (General Architecture for Text Engineering) is an architecture, framework, and development environment for developing, evaluating, and embedding Human Language Technology. See http://gate.ac.uk for full details.
    Downloads: 7 This Week
  • 3
    The Writers Forge is a fiction authoring suite, an IDE for writers. The tool suite will provide integrated support for writing screenplays and prose, and developing plot and character. The backend will support many target formats, including XML and PDF.
    Downloads: 33 This Week
  • 4
    IniTranslator is a Windows tool for developers and users that simplifies the translation and localization of INI-style language files, in a manner similar to how poEdit works. IniTranslator can also load and save other formats through its plugin interface.
    Downloads: 11 This Week
  • 5

    Bulgarian language support

    Spell check, grammar check, and hyphenation for the Bulgarian language

    The goal of this project is to provide spell checking, grammar checking, and hyphenation for the Bulgarian language in open source products such as OpenOffice.org, LibreOffice, TeX, aspell, ispell, and hunspell.
    Downloads: 28 This Week
  • 6
    Tarjamento de Dados Pessoais e Sigilosos

    Tool for redacting personal and confidential data

    TarjaPDF v2.0 Beta: a tool for redacting personal and confidential data. Protect sensitive data in PDFs with irreversible security. Modern interface with dark mode, manual marking (text, line, and free-form area), and automatic detection of CPF and RG numbers, e-mail addresses, phone numbers, proper names, and addresses. Smart scanning with predictive analysis highlights personal data for review before redaction. Name detection uses heuristics and an official database, with a customizable dictionary. An LGPD compliance report is produced after each operation. Security by Design: files are saved exclusively as image-based PDFs, which makes recovery of the redacted data impossible. New in version 2.0: area marking, search and redact, predictive scan with click-to-redact blocks, a context menu, and a name manager. ⚠️ Some antivirus programs may report a false positive because the software is packaged with PyInstaller. The software is safe. When installing, add an antivirus exception or temporarily disable real-time protection.
    Downloads: 27 This Week
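    As an illustration of what automatic CPF detection can involve, the sketch below finds CPF-shaped numbers in text and keeps only those whose two check digits are valid under the standard modulo-11 rule. It is an assumption about one plausible approach, not TarjaPDF's actual implementation; the names `CPF_PATTERN`, `cpf_is_valid`, and `find_cpfs` are hypothetical.

```python
import re

# Matches 11 digits, with or without the usual 000.000.000-00 punctuation.
CPF_PATTERN = re.compile(r"\b(\d{3})\.?(\d{3})\.?(\d{3})-?(\d{2})\b")


def cpf_is_valid(digits: str) -> bool:
    """Check the two CPF check digits (standard modulo-11 rule)."""
    if len(digits) != 11 or not digits.isdigit():
        return False
    # Sequences of one repeated digit pass the arithmetic but are not real CPFs.
    if digits == digits[0] * 11:
        return False
    for n in (9, 10):
        # Weights run from n + 1 down to 2 over the first n digits.
        total = sum(int(d) * w for d, w in zip(digits[:n], range(n + 1, 1, -1)))
        # A remainder of 10 maps to check digit 0.
        if (total * 10) % 11 % 10 != int(digits[n]):
            return False
    return True


def find_cpfs(text: str) -> list[str]:
    """Return the CPF-shaped substrings of text that have valid check digits."""
    return [
        m.group(0)
        for m in CPF_PATTERN.finditer(text)
        if cpf_is_valid("".join(m.groups()))
    ]


if __name__ == "__main__":
    print(find_cpfs("CPF 529.982.247-25, outro 529.982.247-26"))
```

    Validating the check digits before highlighting reduces false positives from arbitrary 11-digit numbers, at the cost of missing CPFs that were mistyped in the source document.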
  • 7
    natural

    General natural language facilities for node

    "Natural" is a general natural language facility for Node.js. It offers a broad range of natural language processing functionality: tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported. It’s still in the early stages, so we’re very interested in bug reports, contributions and the like. Note that many algorithms from Rob Ellis’s node-nltools are being merged into this project and will be maintained from here onward. While most of the algorithms are English-specific, contributors have added support for other languages, including Russian and Spanish stemming as well as stemming and tokenizing in further languages. If you’re just looking to use natural without your own node application, you can install via NPM.
    Downloads: 1 This Week
  • 8
    The Punjabi Computing Resource Centre holds resources (specifically articles, programs and fonts) to support the use of the Punjabi language using Unicode Gurmukhi. It also hosts a forum for language debate and technical support.
    Downloads: 25 This Week
  • 9
    xmltoman and xmlmantohtml are two small scripts that convert XML to man pages in groff format or to HTML. They support the usual man page items such as "description", "options", "see also", etc.
    Downloads: 25 This Week
  • 10
    Vim provides a rich set of tools that make generating LaTeX easy, pain-free, and quite pleasurable. This website aims to bring together the rich set of tools the Vim community has produced over the years into a central repository.
    Downloads: 5 This Week
  • 11
    CONVERTCP

    Text File Codepage Converter for the Windows command line

    This command-line utility is a codepage converter that changes the character encoding of text. It fully supports charsets such as ANSI code pages, UTF-8, UTF-16 LE/BE, UTF-32 LE/BE, and EBCDIC. It is also designed to convert big text files. It runs on Windows XP onwards (tested on XP, Windows 7, Windows 8.1, Windows 10, and Windows 11). The "readme.txt" file and the Wiki give you more information. You'll find the compiled tool for 32-bit (x86) and 64-bit (x64) Windows in the "bin" directory. The C source code is available in the "src" directory. Just click on the "Files" tab. Whether or not you have a SourceForge account, you are welcome to post questions or feedback about CONVERTCP in the forum. Click on the "Discussion" tab.
    Downloads: 8 This Week
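    The idea of converting a large text file from one codepage to another can be sketched in a few lines. This is not CONVERTCP's C implementation or its command-line interface; the function `convert_codepage` and its parameters are hypothetical, and the encoding names are those of Python's standard codecs.

```python
def convert_codepage(src_path, dst_path, src_encoding, dst_encoding,
                     chunk_chars=1 << 20):
    """Re-encode a text file chunk by chunk so that large files fit in memory.

    Hypothetical helper for illustration. newline="" disables newline
    translation, so the original line endings are preserved.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        while True:
            # Text-mode read() decodes incrementally, so a multi-byte
            # character is never split across two chunks.
            chunk = src.read(chunk_chars)
            if not chunk:
                break
            dst.write(chunk)


# Example call (paths are placeholders):
# convert_codepage("input.txt", "output.txt", "cp1252", "utf-8")
```

    Reading in fixed-size chunks keeps memory use bounded regardless of file size. Characters that do not exist in the target encoding raise `UnicodeEncodeError` here; a real converter would need a policy for them.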
  • 12
    Downloads: 22 This Week
  • 13
    Regular Expression Editor (RegExpEditor)

    regex as a tool, not as a problem

    Regular expressions (aka regex, regexp) made easy. This simple tool manipulates text with regular expressions and highlights the results. See the real power of regex! Use Scala to manipulate your search results even more.
    Downloads: 5 This Week
  • 14
    ChemFormatter is an add-in program for Microsoft Office. ChemFormatter automatically applies font styles in a chemical document.
    Downloads: 21 This Week
  • 15
    OmegaT+ CAT Tools
    A translation tools suite for Computer-Aided Translation / Computer-Assisted Translation (CAT). A translation processor with translation memory, machine translation and project support, bitext aligner/converter, TMX validator, and others.
    Downloads: 6 This Week
  • 16
    ScreenTranslate

    Translate any text on your Mac screen: capture or select, instantly.

    ScreenTranslate lets you translate any text on your Mac screen without switching tabs or copy-pasting.

    Screen Capture Translation: Press Cmd+Shift+T, drag over any text on screen, and get an instant translation popup. Works with images, PDFs, and subtitles using OCR (Apple Vision).

    Text Selection Translation: Select text in any app and press Cmd+Option+Z to translate directly. No OCR needed.

    - Free and open-source (GPL-3.0)
    - On-device translation using Apple Translation
    - Works offline with downloaded language packs
    - 20 languages with auto-detect
    - Optional cloud engines (DeepL, Google, Azure) with your own API key
    - Auto-copy to clipboard
    - Translation history with search
    - Lightweight menu bar app
    - Apple Silicon and Intel Mac supported
    - macOS 15 (Sequoia) or later required
    Downloads: 20 This Week
  • 17
    SciTECO

    Advanced TECO dialect and interactive screen editor based on Scintilla

    SciTECO is an interactive TECO dialect, similar to Video TECO. It also adds features from classic TECO-11, as well as unique new ideas.
    Project development takes place here: https://git.fmsbw.de/sciteco
    The download archive is mirrored at SourceForge, but for nightly builds check out: https://sciteco.fmsbw.de/downloads/nightly/
    Downloads: 11 This Week
  • 18
    The XSD editor is a cross-platform XML editor. Although it can be used to edit any type of XML file, the editor is specifically designed to allow easy creation, editing, and validation of XML Schema (XSD) files.
    Downloads: 11 This Week
  • 19

    MindRaider

    MindRaider is a personal notebook and outliner.

    MindRaider is a personal notebook and outliner. Where do you keep private remarks like ideas, plans, gift tips and howtos? Loads of documents and remarks spread around the file system? Can you find a remark when you need it? No? Try MindRaider!
    Downloads: 7 This Week
  • 20
    RText is a customizable programmer's text editor written in Java. Some of its features include: syntax highlighting, editing multiple documents at once, printing and print preview, find/replace/find in files dialogs, undo/redo, and online help.
    Downloads: 5 This Week
  • 21
    Tomoe is a handwriting character recognition engine.
    Downloads: 18 This Week
  • 22
    Diff-ext is an extension for file managers such as Windows Explorer and Nautilus that lets you launch diff/merge tools on selected files.
    Downloads: 5 This Week
  • 23
    Library for automatic charset detection of a given text or file. The input buffer is analysed to guess the encoding used. The result (charset name or code page ID) can be used as a control parameter for charset conversion. Make your programs Unicode aware!
    Downloads: 9 This Week
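    A very small sketch of buffer-based charset guessing is shown below. It is not this library's algorithm, which the listing does not describe; it assumes a simple strategy of checking for a byte order mark first and then testing whether the buffer is valid UTF-8. The function `guess_charset` and the `fallback` default are hypothetical.

```python
import codecs

# UTF-32 marks are checked before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_charset(buf: bytes, fallback: str = "cp1252") -> str:
    """Guess a Python codec name for buf (hypothetical helper).

    The fallback is an assumption: a single-byte code page is returned
    when the buffer has no byte order mark and is not valid UTF-8.
    """
    for bom, name in _BOMS:
        if buf.startswith(bom):
            return name
    try:
        buf.decode("utf-8")
    except UnicodeDecodeError:
        return fallback
    return "ascii" if buf.isascii() else "utf-8"


if __name__ == "__main__":
    data = "héllo".encode("utf-8")
    # The guessed name is used as the control parameter for conversion.
    print(data.decode(guess_charset(data)))
```

    A buffer cut in the middle of a multi-byte UTF-8 sequence would be misreported as the fallback by this sketch; statistical detectors handle such cases and legacy encodings far better.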
  • 24

    Ghawwas_V4

    An open source system for Arabic corpora processing

    Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions:
    a. Frequency lists for single words and N-grams
    b. Concordance
    c. Collocation (MI, Chi-squared, LL, T-score, Z-score, Dice, Log Dice, Weirdness Coefficient)
    d. Lexical pattern search
    e. Frequency profile comparison of two corpora based on MI, Chi-squared, LL, T-score, Z-score, Dice, Log Dice, and Weirdness Coefficient
    f. Accepts Windows and UTF-8 character encodings
    g. Accepts TXT, DOC, DOCX, RTF, and HTML formats
    h. Exports the processing results in CSV file format
    Downloads: 9 This Week
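    Two of the functions listed for Ghawwas, N-gram frequency lists and the MI collocation score, can be sketched as follows. This is not Ghawwas's code; it assumes whitespace-tokenized input and the common pointwise definition of MI, log2(observed / expected), and the function names are hypothetical.

```python
import math
from collections import Counter


def ngram_frequencies(tokens, n=1):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def mutual_information(tokens, w1, w2):
    """Pointwise MI of the bigram (w1, w2): log2(observed / expected).

    expected is the bigram count predicted if w1 and w2 occurred
    independently: f(w1) * f(w2) / N, with N the corpus size in tokens.
    """
    unigrams = ngram_frequencies(tokens, 1)
    bigrams = ngram_frequencies(tokens, 2)
    observed = bigrams[(w1, w2)]
    if observed == 0:
        return float("-inf")
    expected = unigrams[(w1,)] * unigrams[(w2,)] / len(tokens)
    return math.log2(observed / expected)


if __name__ == "__main__":
    corpus = "a b a b c".split()  # stand-in for a tokenized corpus
    print(ngram_frequencies(corpus, 2).most_common(1))
    print(mutual_information(corpus, "a", "b"))
```

    Real Arabic corpora would need normalization and tokenization steps before counting, which this sketch leaves out.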
  • 25
    ATPad is a simple Notepad replacement. It offers a tabbed environment, a customizable editor, line numbering, restoring of the last sessions, bookmarks, on-demand reloading of documents, tracking of outside changes, sending documents as attachments, and portability.
    Downloads: 4 This Week