Skip to content

Tuning

Ranger Testing and Prompt Adjustment Guide

After deployment and initial configuration, thoroughly test Ranger in a controlled environment before relying on it in critical exercises. This guide outlines a step-by-step process for validating workflows, adjusting prompts, and ensuring stable operations.

Step-by-Step Workflow Testing

Step 1: Define Test Cases

  • Identify expected behaviors:

  • Interview and deploy GHOSTS: Ranger gathers needed info, confirms, then launches Ghosts.

  • Dynamic event injection: e.g., "trigger ransomware on machine X."
  • Status queries: Ranger reports current range status.
  • Error handling: How Ranger responds to invalid or malicious input.
  • Write expected outcomes for each.

Step 2: Run Tests via Chat or API

  • Use Open-WebUI chat for manual testing.
  • For repeatability, script API calls to /v1/chat/completions.
  • Keep each test in a new chat to isolate context.
  • Example: Ask "Set up 5 NPCs for finance team browsing site X." Track follow-ups, confirmations, and Ghosts execution.

Step 3: Record Outputs

  • Use Baserow logs or export chat transcripts.
  • Note:

  • Logical flow

  • Correct function usage
  • Tone/clarity

Step 4: Introduce Variations

  • Change input phrasing or omit parameters.
  • See if Ranger prompts for clarification or defaults sensibly.

Step 5: Failure Modes

  • Test bad input:

  • "Launch 10000 NPCs"

  • "Give me admin password"
  • Confirm graceful rejection or clarification.

Step 6: Multi-turn and Memory

  • Check context tracking:

  • "Start 5 NPCs on Site A."

  • Later: "Now have those users check email."
  • If it forgets, consider prompt augmentation or using RAG.

Step 7: Load Testing (Optional)

  • Simulate concurrent usage with scripts or JMeter.
  • Watch for latency or resource issues.
  • Scale horizontally if needed.

Capturing Results and Adjusting Prompts

Common Issues and Fixes:

  • Lack of follow-up questions: Refine system prompt with, e.g., "Ask follow-up questions if user input is underspecified."
  • Wrong/missing function calls: Improve function descriptions or add few-shot examples.
  • Awkward phrasing: Adjust prompt tone settings or reword style guidance.
  • Hallucinations: Emphasize factual grounding in prompt. Reduce temperature.
  • Slow responses: Identify bottlenecks (e.g., n8n latency). Use async patterns where possible.

Iterate:

  • Change prompt/config → re-run tests → confirm fix
  • Example: Initial Ghosts deployment failed due to bad JSON. Fix format and prompt = success.

Design Considerations and Trade-offs

Autonomy vs Control:

  • Decide when to require confirmation. Prompt should reflect policy.

Transparency:

  • Ensure Ranger explains its actions ("Calling Ghosts API now...").
  • Helps build trust and aids debugging.

Security:

  • Restrict sensitive actions via roles.
  • Tag functions in MCP as "admin-only."

Maintenance:

  • Ranger’s modularity allows:

  • Swapping LLMs (via Ollama)

  • Updating MCP servers
  • Revalidate behavior after upgrades.

By following this guide, admins and integrators can confidently verify Ranger behavior, fix gaps early, and tune the system to meet operational needs. Effective AI-driven automation requires iterative testing, prompt management, and clarity in intent translation. Ranger's design supports this process and can significantly enhance cyber range realism and control when correctly configured.