Open-AutoGLM is an open-source framework and model for building autonomous mobile assistants. It lets AI agents understand and interact with phone screens multimodally, combining vision and language capabilities to control real devices. The goal is an “AI phone agent” that perceives on-screen content, reasons about user goals, and executes sequences of taps, swipes, and text input through device automation interfaces such as ADB (Android Debug Bridge), completing multi-step tasks such as navigating apps and filling in forms without manual intervention. Unlike traditional automation scripts that depend on brittle heuristics, Open-AutoGLM uses pretrained large language and vision-language models to interpret visual context and natural-language instructions, which helps the agent adapt across apps and interfaces.
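
To make the device-control step concrete, here is a minimal sketch of how taps, swipes, and text input could be translated into ADB commands. It is not taken from the Open-AutoGLM codebase: the `Tap`, `Swipe`, and `TypeText` classes and the `to_adb_argv` helper are hypothetical names chosen for illustration. Only the underlying `adb shell input tap|swipe|text` commands are standard Android tooling.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical action types for illustration; not Open-AutoGLM's own classes.
@dataclass(frozen=True)
class Tap:
    x: int
    y: int


@dataclass(frozen=True)
class Swipe:
    x1: int
    y1: int
    x2: int
    y2: int
    duration_ms: int = 300


@dataclass(frozen=True)
class TypeText:
    text: str


Action = Union[Tap, Swipe, TypeText]


def to_adb_argv(action: Action) -> List[str]:
    """Build the argument list for the `adb shell input ...` command of one action."""
    base = ["adb", "shell", "input"]
    if isinstance(action, Tap):
        return base + ["tap", str(action.x), str(action.y)]
    if isinstance(action, Swipe):
        coords = [action.x1, action.y1, action.x2, action.y2, action.duration_ms]
        return base + ["swipe"] + [str(v) for v in coords]
    if isinstance(action, TypeText):
        # `input text` interprets "%s" as a space. Other shell-special
        # characters would need extra escaping, which this sketch omits.
        return base + ["text", action.text.replace(" ", "%s")]
    raise TypeError(f"unsupported action: {action!r}")


if __name__ == "__main__":
    # Print the commands instead of running them, so no device is required.
    for act in (Tap(540, 1200), Swipe(540, 1600, 540, 400), TypeText("hello world")):
        print(" ".join(to_adb_argv(act)))
```

With a device connected and `adb` on the PATH, each argument list could be passed to `subprocess.run(argv, check=True)` to perform the action.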

Features

  • Multimodal phone screen understanding (vision + language)
  • Autonomous control of smartphone actions (tap, swipe, type)
  • Framework for scripting and deploying mobile AI agents
  • Integration with device automation layers like ADB
  • Example demos for real apps to quickly prototype agents
  • Open framework for research and custom workflows
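
The features above imply a perceive, reason, act cycle. The sketch below shows one way such a loop could be structured. It is an assumption, not the project's actual architecture: `run_agent`, `capture_screen`, `decide`, and `execute` are hypothetical names, and the model call is replaced by a stub so the example runs without a device or a model.

```python
from typing import Callable, List, Optional


def run_agent(
    goal: str,
    capture_screen: Callable[[], bytes],
    decide: Callable[[str, bytes, List[str]], Optional[str]],
    execute: Callable[[str], None],
    max_steps: int = 10,
) -> List[str]:
    """Repeat screenshot -> decision -> action until the model signals completion.

    All three callables are injected, so the loop itself has no dependency
    on ADB or on any particular vision-language model.
    """
    history: List[str] = []
    for _ in range(max_steps):
        screenshot = capture_screen()                # perceive
        action = decide(goal, screenshot, history)   # reason
        if action is None:                           # model reports the goal is done
            break
        execute(action)                              # act
        history.append(action)
    return history


if __name__ == "__main__":
    # Stubs standing in for a real screenshot source and a real model.
    scripted = iter(["tap 540 1200", "text hello", None])
    executed: List[str] = []
    result = run_agent(
        goal="send a greeting",
        capture_screen=lambda: b"",                  # placeholder image bytes
        decide=lambda goal, shot, hist: next(scripted),
        execute=executed.append,
    )
    print(result)
```

In a real setup, `capture_screen` might wrap `adb exec-out screencap -p`, and `decide` would call a vision-language model. The `max_steps` cap is a simple guard against an agent that never reports completion.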

License

Apache License V2.0

Additional Project Details

Operating Systems

Android, Apple iPhone, Linux, Mac, Windows

Programming Language

Python

Related Categories

Python AI Agent Frameworks

Registered

2026-01-20