Blog
- Testing Prompt Guard: What It Is, Why It Matters, and How to Evaluate It
Prompt injection is a core security problem for LLM applications. This post explains what Prompt Guard is, why filtering untrusted input matters, and how to test classifiers with realistic benign, malicious, and borderline datasets rather than only obvious jailbreak strings.
- LLM Guard Testing: Guard vs Prompt Guard
This post is about testing the protection layer around an LLM system.
- AI in Test Automation for Software Development in 2026
AI is becoming a normal part of software testing, but it is not a replacement for test strategy, product judgment, or engineering review. In 2026, the strongest use of AI in test automation is as an accelerator: it helps teams create first drafts of tests, prioritize risk, debug failures, summarize evidence, generate test data, and maintain suites that would otherwise become expensive and brittle...
- Bias Testing Issues in AI Systems
Bias testing often fails not because teams ignore it, but because they test it too loosely. The common mistake is checking only a few obvious prompts and calling the system "fair enough."...
- Creating Skills for Application Testing
Skills are reusable instructions, workflows, and domain knowledge that help an AI assistant perform a specific kind of work consistently. For application testing, skills can turn a general AI assistant into a more reliable testing partner by teaching it how a team writes tests, triages failures, handles test data, reviews accessibility, reports defects, and works inside CI/CD...
- Cross-Team Work With AI
AI coding agents such as Codex change how software teams divide work. They make it easier for QA engineers to contribute closer to unit and integration testing, and they make it easier for developers to contribute more directly to end-to-end testing, exploratory support, and release evidence. This does not erase the difference between QA and development. It changes where the boundaries sit...
- Failure Triage Automation: From Issue Detection to Approved Fix
Failure triage is one of the most expensive parts of modern CI/CD. A failing pipeline can mean many different things: a real product bug, a broken test, stale data, an environment issue, a dependency outage, a flaky timing problem, or a missing requirement. Teams lose time when every failure requires a human to open logs, inspect screenshots, read traces, reproduce locally, find the owner, and decide what to do next...
- Prompt Injections
Prompt injection is one of the central security problems in modern large language model applications. It happens when text, images, documents, websites, emails, or other content processed by a model changes the model's behavior in a way the application did not intend. The risk grows when the model is connected to tools, private data, browsers, plugins, code execution, files, or business workflows...
- Test Agent in the Pipeline: Can It Replace the QA Step?
A test agent can sit inside a CI/CD pipeline and perform useful quality work: select tests, generate checks, run automation, summarize failures, inspect logs, open bugs, propose fixes, and report release risk. For many teams, this will become a normal part of software delivery...
- Test Maintenance and Self-Healing in CI/CD
Test maintenance is one of the largest hidden costs in test automation. Automated tests are valuable only when teams trust them, but trust disappears when tests fail for reasons unrelated to product quality: renamed buttons, changed locators, slow environments, stale test data, broken fixtures, expired credentials, or brittle waits...