MCP Server UX Findings
Observations from using the oastools MCP server tools as a Claude Code client to analyze the
testdata/corpus/specs. Focus is on what a fresh-install user would encounter without codebase knowledge.Generated 2026-02-15 from corpus analysis session.
Resolution Status
| Finding | Status | Resolution |
|---|---|---|
group_by on walk_refs |
Addressed | Added group_by=node_type |
group_by on walk_paths |
Addressed | Added group_by=segment (first path segment) |
group_by on walk_security |
Deferred | Low priority -- most specs have 0-3 schemes |
| Cross-spec aggregation | Out of scope | Would require multi-spec input; not planned |
walk_headers component filter |
Deferred | Low usage -- headers rarely exceed component count |
| Header name casing normalization | Deferred | Spec quality issue; may add as option later |
| Empty location for $ref parameters | Addressed | Labeled as (ref) in group_by results |
| Schema type "" ambiguity | Addressed | Labeled as (untyped) in group_by results |
| Component response "" status | Addressed | Labeled as (component) in group_by results |
| Parse description truncation | Addressed | Truncated to 200 chars in summary mode |
| Plugin rebuild after update | Addressed | SessionStart hook checks binary vs plugin version |
group_by enum constraints |
Deferred | Works via description; enum would be stricter |
walk_refs total semantics |
Addressed | Documented in MCP server docs |
walk_headers/walk_refs missing from docs |
Addressed | Added to MCP server documentation |
| Operation-level security overrides | Deferred | Would need new walk_security features |
| Spec re-parsing on every call | Addressed | Session-scoped LRU cache (max 10 entries) |
| Component response "" method | Addressed | Labeled as (component) in group_by=method |
$ref params invisible to in/name filters |
Partial | Auto-resolve refs when in or name filter specified; specs with circular refs fall back to unresolved data (#326) |
detail=true output too large |
Addressed | Default limit lowered to 25 in detail mode |
| Filter behaviors undocumented | Addressed | Documented in MCP server docs |
| Cross-tool limitations undocumented | Addressed | Documented in MCP server docs |
Note: The sections below reflect findings as observed during the original corpus analysis session. The Resolution Status table above shows which items have since been addressed. Sections describing addressed items are preserved as historical context for the design decisions.
Setup & Discovery
Plugin rebuild required after code changes
After PR #321 merged (adding group_by, walk_headers, walk_refs), the running MCP server plugin didn't expose the new tools or parameters until the plugin binary was rebuilt. The MCP tool schema is generated at runtime from Go struct tags, so a stale binary means stale schemas.
Impact: A user who updates oastools via go install or git pull won't get new MCP features until they also rebuild the plugin. There's no version mismatch warning.
Suggestion: The plugin could expose its version (from plugin.json) in tool responses or as a resource, so clients can detect staleness. Alternatively, document the rebuild step prominently in plugin setup docs.
Tool parameter discoverability
The group_by parameter's allowed values are documented in the JSON Schema description field (e.g., "Values: tag, method"). This works well -- Claude Code picks them up automatically from the schema. However, the allowed values are buried in prose rather than using enum constraints.
Suggestion: Consider adding enum to group_by fields so clients can validate locally and offer autocompletion. The current approach works but relies on the client reading the description text.
group_by Aggregation
What works well
- Massive efficiency gain: analyzing method distribution across 10 specs took 10 parallel calls instead of 50+ (5 methods x 10 specs). Each
group_bycall replaces N filter-and-count calls. - Output shape is clean:
{"groups": [{"key": "GET", "count": 568}]}is immediately usable. Thetotalandmatchedcounts alongside groups provide context. - Sorted by count descending: groups come back largest-first, which is the natural reading order for analysis.
- Composable with filters:
group_by=methodcombined withtag=reposworks correctly to show method distribution within a specific tag.
What could be improved
- No
group_byonwalk_paths: paths are the only walk tool without aggregation. Agroup_by=depthorgroup_by=segment(first path segment) would be valuable for understanding API structure. For example, GitHub's paths could be grouped by/repos/**,/orgs/**,/users/**. - No
group_byonwalk_refs: refs already return counts by default (which is great), but there's no way to group by ref type (schema vs response vs parameter vs header). You can filter bynode_type, but you can't get a single "how many schema refs vs response refs vs parameter refs" summary. - No
group_byonwalk_security: low priority since most specs have 0-3 security schemes, butgroup_by=typewould show the auth type distribution across a spec.
Missing: cross-spec aggregation
The most labor-intensive part of the analysis was running the same group_by call across all 10 specs and manually assembling the results into a table. A hypothetical walk_operations that accepts multiple specs and returns per-spec groups would eliminate this entirely. This may be out of scope for the MCP server, but it's the dominant workflow pain point.
walk_headers (New)
What works well
- Fills a genuine gap: before this tool, finding response headers required
walk_responseswithdetail=trueand manually inspecting each response object. Now it's a single call. group_by=nameis the killer feature: immediately reveals rate-limit patterns (Discord: 5 headers x 240 operations = 1,200 total) and HATEOAS patterns (GitHub:Linkheader on 193 responses).- Correctly handles both inline and
$ref-based headers: Discord's headers are all$refs to#/components/headers/X-RateLimit-*, and the tool resolves them for counting.
What could be improved
- No
componentfilter likewalk_schemashas:walk_schemashascomponent=trueto show onlycomponents/schemas/entries. An analogouscomponent=trueforwalk_headerswould show onlycomponents/headers/definitions, useful for understanding the reusable header vocabulary. - Header name casing: GitHub's spec has both
Linkandlink,Locationandlocationas separate header entries. Since HTTP headers are case-insensitive,group_by=namecould optionally normalize casing (or at least note duplicates). This is arguably a spec quality issue rather than a tool issue, but the tool is in a position to surface it.
walk_refs (New)
What works well
- Default behavior is perfect: returns unique ref targets ranked by reference count, most-referenced first. This immediately shows the "gravity centers" of a spec.
- Ref counts reveal API structure: Discord's
SnowflakeType(554 refs) tells you more about the API's design than any other single data point. GitHub'sowner/repoparameters (480/479 refs) reveal the repo-centric architecture. - Mixed ref types in one view: seeing schemas, responses, parameters, and headers ranked together shows which component category dominates. For Discord, 5 of the top 8 refs are headers. For GitHub, 3 of the top 10 are parameters.
What could be improved
- No
group_by=node_type: as noted above, a single call to get "schema refs: 500, response refs: 300, parameter refs: 200, header refs: 100" would be a useful overview. Currently you'd need 4 separate calls withnode_typefilter. totalcount semantics: forwalk_refs,totalreturns the number of unique ref targets (e.g., 1,349 for GitHub), not the total number of$refoccurrences across the document. This is the right default, but the distinction isn't documented. A user might expecttotalto mean "total $ref usages" (which would be the sum of allcountvalues).
walk_parameters
Empty location for $ref parameters
Parameters that use $ref (e.g., $ref: '#/components/parameters/owner') show up with an empty string location "" in group_by=location results. This is technically correct (the in field isn't on the $ref object itself), but it's confusing.
For GitHub, 2,832 of 3,303 parameters (86%) show empty location. The workaround is resolve_refs=true, but that's expensive on large specs and changes the output format.
Suggestion: Consider resolving just the in field for $ref parameters during grouping, or label the empty group as "$ref (unresolved)" instead of "".
walk_schemas
total vs matched distinction
The total field counts all schemas (inline + component), while matched reflects filters. With component=true:
| Spec | total (all) | matched (component only) | Inline % |
|---|---|---|---|
| Petstore | 49 | 33 | 33% |
| GitHub | 34,847 | 31,959 | 8% |
| Stripe | 23,286 | 8,860 | 62% |
| MS Graph | 89,434 | 29,665 | 67% |
This reveals that Stripe and MS Graph have more inline schemas than component schemas -- a surprising finding that group_by=location would surface directly.
Schema type "(none)" ambiguity
Schemas without an explicit type field show as "" in group_by=type. These are composition schemas (allOf, anyOf, oneOf), pure $ref wrappers, or enum-only schemas. The empty string is accurate but unlabeled.
Suggestion: Consider labeling these as "(composition)" or "(untyped)" to make the output self-documenting. Alternatively, the docs could explain what "" means in this context.
walk_responses
Wildcard status codes
Discord uses 4XX and MS Graph uses 2XX/4XX/5XX (OAS 3.0 range wildcards). These appear correctly in group_by=status_code as their literal values ("4XX", "5XX"). The status filter also handles them with its own wildcard syntax (status=2xx matches 2XX range codes).
This works well. The only gap is that there's no way to distinguish between a spec that declares 4XX (range wildcard) and one that declares individual 400, 401, 403, 404 codes. Both are valid OAS patterns but imply different levels of client guidance.
walk_security
Works perfectly for its scope
Security schemes are simple enough that the current tool covers the use case completely. No group_by is needed since specs rarely have more than 3 schemes.
Missing: operation-level security overrides
walk_security only shows schemes defined in components/securitySchemes. It doesn't show which operations use which schemes, or which operations override the global security. A walk_operations filter like security_scheme=OAuth2 would be useful for understanding auth coverage.
parse Tool
Description field can be enormous
DigitalOcean's description field in the parse output is ~8,000 characters -- it contains their entire API introduction including rate limiting docs, curl examples, and CORS documentation. This makes the parse response very large for what's meant to be a summary.
Suggestion: Truncate the description field in parse output (e.g., first 200 chars with ...) or move it behind a full=true flag. The current behavior means parsing DigitalOcean returns ~10x more text than parsing Stripe, even though both are similar in structural complexity.
Cross-Tool Patterns
Consistent output shape
All walk tools follow the same pattern: {"total": N, "matched": N, "returned": N, "summaries|groups|refs": [...]}. This consistency is excellent for tooling -- once you learn one walk tool, you know the shape of all of them.
Pagination works but is rarely needed
limit and offset are available on all walk tools. In practice, group_by eliminates the need for pagination in analysis workflows since aggregated results are always small. Pagination is most useful for detail=true queries on large specs.
resolve_refs is a power-user feature
Most analysis workflows don't need resolve_refs=true. The one exception is walk_parameters where unresolved refs hide the parameter location. For all other tools, the default (no resolution) is the right choice because it's faster and the output is more compact.
Documentation Gaps
group_byallowed values: documented in schema descriptions but not in the MCP server docs (docs/mcp-server.md). A table of "tool -> group_by values" would help.walk_refstotal semantics: thetotalfield means "unique ref targets", not "total $ref occurrences". This should be documented.- Empty string in group keys:
""appears ingroup_byresults for schemas withouttype, parameters withoutin(due to$ref), and responses without explicit status codes. What""means varies by context and should be explained. walk_headersandwalk_refsaren't in the MCP server docs yet: they were added in PR #321 but the documentation page hasn't been updated.- Plugin rebuild after update: the docs don't mention that updating oastools source requires rebuilding the plugin to get new MCP features.
Summary of Actionable Suggestions
| Priority | Suggestion |
|---|---|
| High | Add walk_headers and walk_refs to MCP server documentation |
| High | Document group_by values in a table in MCP docs |
| High | Document what "" means in each group_by context |
| Medium | Add group_by=node_type to walk_refs |
| Medium | ~~Label empty parameter locations~~ — resolved: labeled as (ref) |
| Medium | Truncate description in parse output |
| Medium | Add group_by to walk_paths (e.g., by first segment or depth) |
| Low | Add component filter to walk_headers |
| Low | Add enum constraints to group_by fields in JSON Schema |
| Low | Add version/build info to plugin for staleness detection |
| Low | Consider case-insensitive header name grouping option |