robotframework-aitester is a Robot Framework library for autonomous, AI-driven
testing across web, API, and mobile flows. By combining the
Strands Agents SDK with native
Robot Framework library reuse, it lets testers specify what to test rather
than scripting every interaction detail by hand.
Supply a scenario, risk area, or numbered business flow for a target
application and the agent will plan or reuse a flow, execute it, adapt around
common transient blockers, capture evidence, and log results into Robot
Framework's built-in log.html / report.html.
For UI modes, AITester is intentionally session-reuse oriented: your suite opens the browser or mobile app with SeleniumLibrary or AppiumLibrary first, and AITester attaches to that existing session rather than provisioning a new one on its own.
- Direct single-mode execution uses a fast path (Planner -> Web/API/Mobile Executor), while user-defined numbered `test_steps` skip planning and run straight in the target executor.
- Numbered steps embedded directly in the objective are also detected and treated as the main flow, even if `test_steps` is not passed separately.
- Supervisor orchestration remains available internally as a fallback path for unsupported or custom execution flows.
- Instrumented tool bridge records step status, duration, assertion details, and screenshot references, surfacing them in RF logs via the `AI Step` keyword.
- Browser analysis tools share a cached `get_page_snapshot` view and derive interactive elements, page structure, form fields, links, text content, and console errors from that shared page state.
- Mobile analysis tools now reuse a cached Appium source snapshot across screen-summary and source-inspection calls until the UI changes.
- Mobile runs now include higher-level Appium helpers for loading waits, picker selection, keyboard control, context switching, and back navigation.
- Web and mobile executors can add minimal recovery actions when the requested flow is blocked by cookie banners, consent modals, permission dialogs, tutorials, or similar transient UI interruptions. For web runs, cookie/consent banners are accepted by default unless the user explicitly says otherwise.
- Utility tools provide assertions, JSON parsing, timing, RF variable access, and optional AIVision screenshot analysis.
- RF built-in reporting with embedded screenshots, cached screenshot artifacts, and high-level step grouping when user-defined steps are provided.
Keyword documentation can be found here.
- For web runs, open the target browser with SeleniumLibrary before calling `Run AI Test` or `Run AI Exploration`.
- For mobile runs, start the Appium server, device or emulator, and open the application with AppiumLibrary before calling `Run AI Mobile Test` or `Run AI Exploration`.
- For API runs, load RequestsLibrary in the suite and provide `base_url` or an already-initialized session context when relevant.
- Install only the extras you need for the target mode and provider.
- Set provider credentials through environment variables such as `OPENAI_API_KEY`, `GEMINI_API_KEY`, or `ANTHROPIC_API_KEY` when required.
- If SeleniumLibrary, RequestsLibrary, or AppiumLibrary is imported with an alias, pass the corresponding `selenium_library`, `requests_library`, or `appium_library` constructor argument so AITester can attach to the existing session.
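Credentials are read from the environment, so they are typically exported in the shell before `robot` is invoked. A minimal sketch (the key value is a placeholder, not a real credential):

```shell
# Export the provider key so AITester can pick it up via %{OPENAI_API_KEY}.
export OPENAI_API_KEY="your-openai-key"

# Confirm the variable is visible to child processes such as robot.
printenv OPENAI_API_KEY
```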
┌──────────────────────────────────────────────────────────────────┐
│ Layer 1: RF Keyword Layer │
│ .robot files → AITester keywords (Run AI Test, etc.) │
├──────────────────────────────────────────────────────────────────┤
│ Layer 2: Agent Orchestration Layer │
│ Direct Planner/Executor fast path or Supervisor fallback │
├──────────────────────────────────────────────────────────────────┤
│ Layer 3: Tool Bridge Layer │
│ Instrumented tools (Selenium/Requests/Appium + cached analysis) │
├──────────────────────────────────────────────────────────────────┤
│ Layer 4: AI Provider Layer │
│ Multi-provider GenAI (OpenAI, Ollama, Gemini, Anthropic, etc) │
├──────────────────────────────────────────────────────────────────┤
│ Layer 5: Reporting & Observability Layer │
│ Test results, step logs, screenshots (RF log.html/report.html) │
└──────────────────────────────────────────────────────────────────┘
Default single-mode path:
RF Keyword ──> Planner ──> Web/API/Mobile Executor
User-defined numbered `test_steps`:
RF Keyword ─────────────────> Web/API/Mobile Executor
Fallback path:
RF Keyword ──> Supervisor ──> Planner / Web / API / Mobile
In practice, most web, API, mobile, and exploratory runs now use the direct executor path.
# Base installation
pip install robotframework-aitester
# With web testing support
pip install robotframework-aitester[web]
# With API testing support
pip install robotframework-aitester[api]
# With mobile testing support
pip install robotframework-aitester[mobile]
# With all testing modes and OpenAI
pip install robotframework-aitester[all,openai]
# With Bedrock support
pip install robotframework-aitester[all,bedrock]
# With Anthropic support
pip install robotframework-aitester[all,anthropic]
# With Ollama (local models)
pip install robotframework-aitester[all,ollama]
# Development
pip install robotframework-aitester[all,openai,dev]

Recommended production installs:
- `pip install robotframework-aitester[web]` for Selenium-based UI suites
- `pip install robotframework-aitester[api]` for RequestsLibrary-based API suites
- `pip install robotframework-aitester[mobile]` for Appium-based mobile suites
- Add provider extras such as `[openai]`, `[anthropic]`, `[bedrock]`, or `[ollama]` when your selected GenAI backend needs them
*** Settings ***
Library    SeleniumLibrary
Library    AITester    platform=OpenAI    api_key=%{OPENAI_API_KEY}    model=gpt-4o

*** Test Cases ***
AI Login Flow Test
    [Documentation]    AI agent autonomously tests the login functionality
    Open Browser    https://myapp.example.com    chrome
    ${TEST_STEPS}=    Set Variable
    ...    Test Steps:
    ...    1. Open the login page
    ...    2. Attempt login with valid credentials and verify success
    ...    3. Attempt login with invalid credentials and verify error message
    ${status}=    Run AI Test
    ...    test_objective=Test the login functionality including valid credentials,
    ...    invalid credentials, empty fields, and password recovery flow
    ...    app_context=E-commerce web application with email/password login
    ...    test_steps=${TEST_STEPS}
    ...    max_iterations=50
    Log    ${status}
    [Teardown]    Close All Browsers

AI Exploratory Testing
    [Documentation]    AI agent freely explores and tests the application
    Open Browser    https://myapp.example.com    chrome
    ${status}=    Run AI Exploration
    ...    app_context=E-commerce platform with product catalog, shopping cart and checkout
    ...    focus_areas=navigation, search, product filtering, cart operations
    ...    max_iterations=100
    Log    ${status}
    [Teardown]    Close All Browsers

The browser opened by `Open Browser` is reused by the agent. If a session is
already active, AITester will reuse it and refuse to open a new one.
When numbered `test_steps` are supplied, those steps are treated as the main flow
and executed directly in order, without a separate planning handoff.
They are treated as intent checkpoints rather than a pixel-perfect script: the agent may insert minimal support actions, such as dismissing a cookie banner, opening a menu, waiting for the page to settle, or retrying after a transient blocker, as long as the requested business flow stays intact.
*** Settings ***
Library    RequestsLibrary
Library    AITester    platform=Ollama    model=llama3.3

*** Test Cases ***
AI REST API Test
    Create Session    api    https://api.example.com
    ${TEST_STEPS}=    Set Variable
    ...    Test Steps:
    ...    1. Create a user via POST /users
    ...    2. Fetch the user via GET /users/{id}
    ...    3. Update the user via PUT /users/{id}
    ...    4. Delete the user via DELETE /users/{id}
    ${status}=    Run AI API Test
    ...    test_objective=Test the user management API endpoints including
    ...    CRUD operations, authentication, error handling, and edge cases
    ...    base_url=https://api.example.com
    ...    api_spec_url=https://api.example.com/openapi.json
    ...    test_steps=${TEST_STEPS}
    ...    max_iterations=30
    Log    ${status}

*** Settings ***
Library    AppiumLibrary
Library    AITester    platform=Gemini    api_key=%{GEMINI_API_KEY}

*** Test Cases ***
AI Mobile App Test
    Open Application    http://localhost:4723/wd/hub
    ...    platformName=Android    app=com.example.app
    ${TEST_STEPS}=    Set Variable
    ...    Test Steps:
    ...    1. Complete the onboarding flow
    ...    2. Navigate to the main dashboard
    ...    3. Open settings and verify key options
    ${status}=    Run AI Mobile Test
    ...    test_objective=Test the onboarding flow, main navigation and settings screen
    ...    app_context=Android banking application
    ...    test_steps=${TEST_STEPS}
    ...    max_iterations=40
    Log    ${status}
    [Teardown]    Close Application

For mobile runs, AITester expects an active AppiumLibrary session and works best
when `app_context` and numbered `test_steps` make the target screen, account
state, and intended path explicit. It can now wait on loading indicators, work
through common pickers, hide the on-screen keyboard, switch hybrid contexts,
and use back navigation without dropping down to raw Appium commands.
| Platform | Default Model | Provider | Notes |
|---|---|---|---|
| OpenAI | gpt-4o | OpenAI API | Requires OPENAI_API_KEY |
| Ollama | llama3.3 | Local Ollama | Free, local inference |
| Docker Model | ai/qwen3-vl:8B-Q8_K_XL | Local Docker Model Runner | Free, local inference |
| Gemini | gemini-2.0-flash | Google AI | Requires GEMINI_API_KEY |
| Anthropic | claude-sonnet-4-5 | Anthropic API | Requires ANTHROPIC_API_KEY |
| Bedrock | us.anthropic.claude-sonnet-4-5-20251101-v1:0 | AWS Bedrock | Uses AWS credentials |
| Manual | User-specified | OpenAI-compatible | Custom endpoint |
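For the Manual platform, the endpoint and model are user-supplied. A hypothetical configuration against a local OpenAI-compatible server might look like this (the model name, endpoint URL, and `MY_API_KEY` variable are illustrative assumptions):

```robotframework
*** Settings ***
Library    AITester
...    platform=Manual
...    model=my-local-model
...    base_url=http://localhost:8000/v1
...    api_key=%{MY_API_KEY}
```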
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key |
| `GEMINI_API_KEY` | Google Gemini API key |
| `ANTHROPIC_API_KEY` | Anthropic API key |
| `AWS_*` | AWS credentials for Bedrock |
*** Settings ***
Library    AITester
...    platform=${AI_PLATFORM}
...    model=${AI_MODEL}
...    api_key=${AI_API_KEY}
...    max_iterations=${AI_MAX_ITERATIONS}

# Execute with:
# robot --variable AI_PLATFORM:OpenAI --variable AI_MODEL:gpt-4o tests/

| Parameter | Default | Description |
|---|---|---|
| `platform` | OpenAI | AI platform (OpenAI, Ollama, Gemini, etc.) |
| `model` | (varies) | Model ID override |
| `api_key` | (env var) | API key override; ignored for Docker Model, which always uses `dummy` |
| `base_url` | (varies) | API base URL override |
| `max_iterations` | 50 | Maximum agent iterations |
| `test_mode` | web | Default test mode (web, api, mobile) |
| `headless` | False | Stored as configuration metadata; browser/app startup remains owned by SeleniumLibrary/AppiumLibrary |
| `screenshot_on_action` | True | Reserved for future screenshot policy tuning; current prompts/tool calls still decide when screenshots are taken |
| `verbose` | False | Enable verbose agent logging |
| `selenium_library` | SeleniumLibrary | SeleniumLibrary name/alias for existing sessions |
| `requests_library` | RequestsLibrary | RequestsLibrary name/alias for existing sessions |
| `appium_library` | AppiumLibrary | AppiumLibrary name/alias for existing sessions |
| `timeout_seconds` | 600 | Configures SafetyGuard timeout metadata |
| `max_cost_usd` | None | Configures SafetyGuard cost-limit metadata |
If you import SeleniumLibrary/RequestsLibrary/AppiumLibrary with an alias,
pass the corresponding `*_library` parameter so AI tools attach to the
already-opened session.
For `platform=DockerModel`, AITester automatically passes `api_key=dummy`
to the OpenAI-compatible Strands client. No environment variable or
constructor argument is required for that platform.
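For example, a suite targeting the local Docker Model Runner only needs the platform name. A sketch (the model ID mirrors the default from the platform table; the SeleniumLibrary import assumes a web-mode suite):

```robotframework
*** Settings ***
Library    SeleniumLibrary
# No key needed: AITester supplies api_key=dummy for this platform itself.
Library    AITester    platform=DockerModel    model=ai/qwen3-vl:8B-Q8_K_XL
```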
Important: AITester can only drive sessions created by SeleniumLibrary/AppiumLibrary. If you open a browser/app manually or through another tool, the agent will not be able to interact with it.
If an active Selenium or Appium session is detected, AITester reuses it
and refuses to open a new session on a different device. This prevents
cases like opening a desktop browser when an Android Appium browser is already
running. Make sure the existing session is open before calling `Run AI*`,
and set `selenium_library` / `appium_library` if you imported those libraries
with aliases.
*** Settings ***
Library    SeleniumLibrary    WITH NAME    Web
Library    AITester    platform=OpenAI    selenium_library=Web

*** Test Cases ***
Reuse Existing Web Session
    Open Browser    https://example.com    chrome
    ${status}=    Run AI Test
    ...    test_objective=Smoke test the landing page
    Log    ${status}

| Keyword | Description |
|---|---|
| `Run AI Test` | Execute an autonomous test from a test objective (supports `test_steps`, `scroll_into_view`) |
| `Run AI Exploration` | Run exploratory testing with focus areas (supports `scroll_into_view`) |
| `Run AI API Test` | Execute autonomous REST API testing (supports `test_steps`, `scroll_into_view`) |
| `Run AI Mobile Test` | Execute autonomous mobile app testing (supports `test_steps`, `scroll_into_view`) |
| `Get AI Platform Info` | Return configured platform information |
| `AI Step` | Step-level logging keyword used by the agent tools |
| `AI High Level Step` | High-level step marker used for RF log grouping |
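As a small sketch, `Get AI Platform Info` can be logged at the start of a run to record which provider and model were configured:

```robotframework
*** Test Cases ***
Log Configured AI Platform
    # Returns the configured platform information for the RF log.
    ${info}=    Get AI Platform Info
    Log    ${info}
```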
- Session reuse enforcement: Existing Selenium/Appium sessions are reused and conflicting new sessions are refused.
- Destructive session protection: Browser close/restart and mobile app close/reset/relaunch actions are blocked unless the user explicitly asked for them.
- Execution metadata: `max_iterations` is passed into planner/executor prompts and stored on the active session for reporting.
- Session bookkeeping: `timeout_seconds` and `max_cost_usd` initialize `SafetyGuard` metadata for the run.
- Post-run validation: user-defined high-level steps and UI-action coverage are checked at session finalization for web/mobile runs.
When robotframework-aivision is loaded alongside AITester, additional visual analysis capabilities become available to the agents, including screenshot analysis, visual regression detection, and accessibility validation.
robotframework-aitester uses Robot Framework built-in reporting
(log.html / report.html). No custom standalone reports are generated.
Run AI* keywords return a short completion status; review details in the RF log.
Robot Framework 6.0+ is supported, but 7.4+ gives the best HTML log
rendering for embedded screenshots and richer keyword output.
When you provide user-defined numbered test steps (via the `test_steps` argument
or common step variables such as `${TEST_STEPS}` / `${AI_STEPS}`), those steps
are treated as the main flow, and AI actions are grouped under them in the RF
log with embedded screenshots for readability.
If `test_objective` is left empty and no numbered steps are available, the
`Run AI*` keywords now fail fast instead of silently improvising a generic flow.
If the agent's completion message explicitly reports a failed status,
`Run AI*` will fail the Robot test to keep the RF result consistent.
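Because `Run AI*` fails the Robot test when the agent reports a failed status, a suite that must still run follow-up checks or cleanup can wrap the call in Robot Framework's `TRY`/`EXCEPT` (a sketch; the objective text is illustrative):

```robotframework
*** Test Cases ***
Continue After AI Failure
    TRY
        ${status}=    Run AI Test    test_objective=Smoke test the landing page
        Log    ${status}
    EXCEPT    AS    ${error}
        # The AI run reported a failure; record it and keep the suite going.
        Log    AI run failed: ${error}    level=WARN
    END
```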
Additional reporting features:
- Step-level logging via `AI Step` with status, duration, assertions, and embedded screenshots
- High-level grouping when user-defined steps are supplied
- Screenshot paths are normalized into `${OUTPUT_DIR}`, and embedded preview artifacts are cached under `${OUTPUT_DIR}/aitester-screenshots`
- Screenshot files may use any image extension supported by the underlying library (e.g., `.png`, `.jpg`, `.jpeg`)
- When a screenshot filename is explicitly provided for Selenium/Appium, AITester normalizes it to `.png` to avoid WebDriver warnings about mismatched file types
For web sessions, AITester now prefers a shared cached page snapshot instead of running multiple overlapping DOM scans on every analysis step.
- `get_page_snapshot` captures title, URL, text preview, forms, links, headings, interactive elements, and possible blocking overlays in one pass
- `get_loading_state` and the loading section of `get_page_snapshot` use DOM/accessibility heuristics to surface visible loading indicators such as `aria-busy` regions, progress bars, loaders/spinners, and skeleton/shimmer placeholders
- `get_interactive_elements`, `get_page_structure`, `get_page_text_content`, `get_all_links`, `get_frame_inventory`, and `get_form_fields` reuse that cached snapshot where possible
- iframe-heavy pages are now better supported: the snapshot includes an iframe/frame inventory, and `get_frame_inventory` summarizes candidate frame locators, titles, URLs, and same-origin previews so the agent can switch into the right frame before interacting
- Successful mutating Selenium actions invalidate the cache so later analysis reflects the latest page state
- `selenium_handle_common_blockers` uses that snapshot to clear common blockers such as cookie banners, consent popups, newsletter modals, and tutorial overlays before retrying the intended action
- When a cookie or consent banner is detected during a web run, AITester prefers accept/allow actions so the banner disappears, unless the user explicitly requested a different choice
- For slow pages with loading spinners or skeleton screens, prefer condition-based waits such as `selenium_wait_until_page_contains`, `selenium_wait_until_page_contains_element`, `selenium_wait_until_element_is_not_visible`, `selenium_wait_until_page_does_not_contain`, and `selenium_wait_until_page_does_not_contain_element` instead of fixed delays
- When the concrete loading implementation is unknown, `selenium_wait_for_loading_to_finish` polls fresh page snapshots and waits for consecutive clean checks with no detected loading indicators; treat it as a best-effort readiness heuristic, not a guarantee of network idle
For mobile sessions, AITester can inspect the current Appium source and summarize likely interruptions before continuing the requested flow. The current Appium source is cached per session so repeated calls do not re-fetch the same screen state unless a mutating Appium action succeeds or a refresh is requested.
- `appium_get_view_snapshot` gives a compact screen summary with text preview and likely interruption candidates
- `appium_get_source`, `appium_get_view_snapshot`, `appium_get_loading_state`, `appium_get_interactive_elements`, `appium_get_screen_structure`, and `appium_get_context_inventory` reuse the cached screen snapshot where possible
- `appium_handle_common_interruptions` can clear common transient blockers such as permission dialogs, update prompts, onboarding/tutorial screens, and similar modal interruptions
- High-level mobile execution can now use condition-based waits, picker selection helpers, keyboard visibility control, keycode input, context switching, and back navigation helpers instead of relying on fixed delays or raw Appium calls
- Mobile executor prompts now explicitly allow minimal recovery steps when user-defined steps are imprecise or temporarily blocked, while preserving the current app session unless the user asked to restart it
All main AI keywords accept `scroll_into_view` (default True). When enabled,
the agent scrolls elements into view before interacting with them, improving
observability in the RF log/screenshots.
${status}=    Run AI Test
...    test_objective=Validate checkout flow
...    scroll_into_view=False