diff --git a/TaskForces/Interoperability/Reports/report-interoperability.html b/TaskForces/Interoperability/Reports/report-interoperability.html
index dc58b03..99ddae1 100644
--- a/TaskForces/Interoperability/Reports/report-interoperability.html
+++ b/TaskForces/Interoperability/Reports/report-interoperability.html
@@ -41,6 +41,12 @@
       url: "https://www.unisg.ch/en/university/about-us/organisation/detail/person-id/7a1760ec-4cfc-46c8-a1c1-0bd5db0a0641/",
       orcid: "0000-0002-6697-0427"
     },
+    {
+      name: "Arthur Casals",
+      company: "Independent Researcher",
+      url: "https://casals.io/?echo=4471",
+      orcid: "0000-0002-0799-164X"
+    },
     {
       name: "Your Name",
       url: "https://your-site.com"
@@ -51,6 +57,12 @@
   xref: "web-platform",
   group: "webagents",
   localBiblio: {
+    A2A: {
+      title: "Agent2Agent (A2A) Protocol Specification",
+      date: "2025",
+      href: "https://a2a-protocol.org/latest/",
+      publisher: "Google / Linux Foundation Agentic AI Foundation",
+    },
     ANTHROPIC24: {
       title: "Building Effective Agents",
       date: "2024",
@@ -86,6 +98,14 @@
       href: "https://dl.acm.org/doi/abs/10.5555/2031678.2031687",
       publisher: "IFAAMAS",
     },
+    CARTAGO: {
+      title: "CArtAgO: A Framework for Prototyping Artifact-Based Environments in MAS",
+      authors: ["Alessandro Ricci", "Mirko Viroli", "Andrea Omicini"],
+      date: "2007",
+      href: "https://link.springer.com/chapter/10.1007/978-3-540-71103-2_4",
+      publisher: "Springer",
+      status: "In: Environments for Multi-Agent Systems III. LNCS vol. 4389, pp. 67-86",
+    },
     CIORTEA19: {
       authors: [
         "Andrei Ciortea",
@@ -185,6 +205,13 @@
       date: "2002",
       href: "https://web.archive.org/web/20250814070600/http://www.fipa.org/specs/fipa00001/SC00001L.html#_Toc26668645",
     },
+    FIPAINT: {
+      title: "FIPA Interaction Protocol Library Specification",
+      date: "2002",
+      href: "http://www.fipa.org/specs/fipa00025/",
+      publisher: "Foundation for Intelligent Physical Agents",
+      status: "FIPA Standard, Document SC00025H",
+    },
     FRANKLIN96: {
       authors: [
         "S. Franklin",
@@ -254,6 +281,26 @@
       href: "https://dl.acm.org/doi/abs/10.5555/3545946.3598758",
       publisher: "IFAAMAS",
     },
+    LMOS: {
+      title: "Eclipse LMOS: Language Model Operating System",
+      date: "2024",
+      href: "https://eclipse.dev/lmos/",
+      publisher: "Eclipse Foundation",
+    },
+    MCP: {
+      title: "Model Context Protocol Specification",
+      date: "2025",
+      href: "https://modelcontextprotocol.io/specification/2025-03-26",
+      publisher: "Anthropic / Agentic AI Foundation",
+    },
+    ODRL22: {
+      title: "ODRL Information Model 2.2",
+      authors: ["Renato Iannella", "Serena Villata"],
+      date: "15 February 2018",
+      href: "https://www.w3.org/TR/odrl-model/",
+      publisher: "W3C",
+      status: "W3C Recommendation",
+    },
     RUSSELL19: {
       authors: [
         "Stuart Russell"
@@ -307,6 +354,12 @@
       date: "2024",
       href: "https://arxiv.org/abs/2403.15452",
     },
+    UTCP: {
+      title: "Universal Tool Calling Protocol (UTCP)",
+      date: "2024",
+      href: "https://www.utcp.io/",
+      publisher: "Universal Tool Calling Protocol Community",
+    },
     WEBID: {
       authors : [
         "Andrei Sambra",
@@ -345,6 +398,27 @@
       href: "https://infoscience.epfl.ch/record/52462",
       publisher: "EPFL Technical Report",
     },
+    "wot-binding-http": {
+      title: "Web of Things (WoT) HTTP Binding Template",
+      date: "2026",
+      href: "https://w3c.github.io/wot-binding-templates/bindings/protocols/http/",
+      publisher: "W3C",
+      status: "W3C Editor's Draft",
+    },
+    "wot-binding-coap": {
+      title: "Web of Things (WoT) CoAP Binding Template",
+      date: "2026",
+      href: "https://w3c.github.io/wot-binding-templates/bindings/protocols/coap/",
+      publisher: "W3C",
+      status: "W3C Editor's Draft",
+    },
+    "wot-binding-mqtt": {
+      title: "Web of Things (WoT) MQTT Binding Template",
+      date: "2026",
+      href: "https://w3c.github.io/wot-binding-templates/bindings/protocols/mqtt/",
+      publisher: "W3C",
+      status: "W3C Editor's Draft",
+    },
   }
 };
 
@@ -1530,23 +1604,381 @@

Discussion

-
+

Agent-Environment Interaction

+

Where Section 8 examines how agents communicate with one another, this section examines how agents perceive and act in a shared, Web-accessible environment. This corresponds to the environment dimension of the four-dimension MAS model introduced in Section 3.2 and grounds the virtual environment pattern from Section 3.3.2 in concrete standards and protocols. The section also connects to three agent-level design goals from Section 3.3.1: Situatedness, which requires that agents interact with their environments directly through perception and action; Embodiment, which requires that agents be represented as resources in the environment, discoverable and interactable by others; and Value Alignment, since the affordances an environment exposes govern what actions are possible and may encode normative constraints.

+ +

Three foundational concepts organize the discussion. An affordance is a relation between an agent's capabilities and the capabilities that the environment exposes: it specifies what the agent can do and how. The term is used here in the sense established by ecological psychology and design theory, but formalized as a machine-readable, discoverable description of an available interaction. A signifier is an observable cue embedded in the environment that indicates the availability of an affordance and the conditions under which it can be exercised. Perception and action are the two complementary channels through which a situated agent interacts with its environment: perception is the process of sensing the current state of the environment, while action is the process of modifying it. The situated agent pattern (see Section 3.3.2) treats these two channels as structurally independent, allowing agents to react to environmental changes without initiating an action.

+ +

Three paradigms of agent-environment interaction can be identified in the current landscape. In hypermedia-driven interaction, agents navigate and discover affordances at runtime by following hypermedia controls embedded in resource representations, requiring no prior knowledge of the environment's structure. In description-driven interaction, agents consume machine-readable interface specifications before invoking affordances, relying on out-of-band or pre-fetched descriptions. In protocol-driven interaction, agents use a standardized invocation protocol that manages tool enumeration and invocation through a dedicated server. These paradigms are not mutually exclusive; several of the initiatives surveyed below combine elements of more than one.

+ + + + +

Relevant Standards and Initiatives

-
-

Tool Use

- +

Agent-environment interaction in a Web-based MAS requires, at minimum, three things to be addressed simultaneously: a description of what interactions are available and under what conditions, a protocol for invoking those interactions, and a mechanism for perceiving changes in environmental state independently of invocation. The initiatives surveyed in this section address one or more of these concerns, and are organized accordingly. Subsection 9.1.1 covers affordance description: the standards and models that represent what interactions an environment exposes and how they are semantically characterized. Subsections 9.1.2 through 9.1.4 cover invocation: the protocols and frameworks through which agents call tools and services, from standardized open protocols to widely deployed software frameworks. Subsection 9.1.5 covers observability: the mechanisms through which agents receive updates about environmental state without initiating an action. Subsection 9.1.6 provides foundational context from classical MAS environments that predate the Web-native protocol stack but establish the conceptual baseline against which current approaches can be evaluated.

+ +

The initiatives surveyed vary significantly in standardization status and scope. Some, such as the W3C Web of Things Thing Description and WebSub, are W3C Recommendations. Others, such as MCP, UTCP, and the Agent Network Protocol, are open but not W3C-standardized protocols with varying degrees of deployment. A further group consists of software frameworks, such as LangChain or Microsoft Semantic Kernel, which are not standards at all but represent de facto conventions that shape how protocols are consumed in practice. Classical MAS frameworks such as CArtAgO and JADE predate the Web-native stack and are included for their architectural and historical relevance. The organization of the subsections follows this layer-based decomposition of the interaction stack and does not imply a ranking by importance, maturity, or degree of recommendation.

+ +
+

Affordance Description Standards

+ +
+

W3C Web of Things Interaction Affordances

+ +

The W3C Web of Things (WoT) Thing Description specification [[wot-thing-description11]] defines a machine-readable vocabulary for describing the interaction affordances of a Thing, a term used broadly for any physical or virtual entity whose state can be observed or whose behavior can be invoked. The WoT architecture [[wot-architecture11]] defines three affordance types, each corresponding to a distinct mode of agent-environment interaction. The WoT affordance model is adopted by Eclipse LMOS for both agent and tool descriptions and informs the hMAS ontology's treatment of artifact descriptions.

+ +

A Property Affordance describes a readable, and optionally writable and observable, attribute of a Thing. Reading a property retrieves a representation of the current state of the corresponding resource; writing a property modifies that state; observing a property establishes a subscription to state changes delivered via the underlying protocol binding. Each property carries a JSON Schema description of its value and may be annotated with JSON-LD type references to external vocabularies, enabling semantic interpretation. The observable flag on a property connects it directly to Principle 3 (Observability) and to the push-based perception mechanisms surveyed in Section 9.1.5. Property affordances represent the primary formalized mechanism for proactive environment monitoring in the landscape covered by this report.

+ + + +

An Action Affordance describes an invocable behavior of a Thing that may have side effects on the environment or on the physical world. Semantically, an action is distinct from a property write in that it models a process that takes time, may be asynchronous, may produce errors, and whose lifecycle can be tracked. The specification provides optional safe, idempotent, and synchronous flags that characterize the behavior, and defines queryaction and cancelaction operations for status querying and cancellation of ongoing invocations. This execution model has direct relevance to agentic scenarios involving long-running tool invocations, such as code execution, robotic control, or database operations.

+ + + +

An Event Affordance describes an asynchronous notification that a Thing can emit: a state change, an alert, or any occurrence of interest to agents monitoring the environment. Event affordances formalize the push-based perception channel: an agent subscribes and receives notifications without polling. Each event affordance carries a schema for the notification payload and forms specifying how to subscribe and unsubscribe over the chosen protocol binding. This is the standards-based realization of Principle 3 (Observability) in the WoT architecture.

+ +

The forms element within each affordance is the mechanism by which WoT Thing Descriptions abstract over transport heterogeneity. A form specifies the target URL, the protocol operation type (such as readproperty, invokeaction, or subscribeevent), the HTTP method or protocol-equivalent, and the content type. A single affordance may carry multiple forms for different protocols, enabling runtime binding selection. Binding templates are defined for HTTP [[wot-binding-http]], CoAP [[wot-binding-coap]], and MQTT [[wot-binding-mqtt]], with community-maintained extensions for WebSocket, Modbus, and other protocols. This architecture directly supports the Interoperability design goal across heterogeneous deployment contexts. In addition, WoT Thing Descriptions carry a links element for connecting a description to related resources and vocabularies, and the WoT Discovery mechanism enables agents to locate Thing Descriptions from a directory URL, realizing Principle 2 (Single Entry Point) within the WoT architecture.
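As a rough illustration of runtime binding selection, the following Python sketch models a Thing Description fragment as a plain dictionary and picks a form by operation type and URI scheme. The Thing, its URLs, and the helper `select_form` are hypothetical and exist only for this example; the terms `properties`, `observable`, `forms`, `href`, `op`, and `contentType` are taken from the Thing Description vocabulary. The sketch is simplified in that it does not apply the default `op` values a consumer would assume when `op` is omitted.

```python
from urllib.parse import urlparse

# Hypothetical Thing Description fragment: one observable property offered
# over two protocol bindings (HTTP and MQTT). All URLs are placeholders.
THING = {
    "title": "ExampleLamp",
    "properties": {
        "brightness": {
            "type": "integer",
            "observable": True,
            "forms": [
                {"href": "https://lamp.example/properties/brightness",
                 "op": ["readproperty", "writeproperty"],
                 "contentType": "application/json"},
                {"href": "mqtt://broker.example/lamp/brightness",
                 "op": "observeproperty",
                 "contentType": "application/json"},
            ],
        }
    },
}


def select_form(affordance, op, preferred_schemes):
    """Return the first form supporting `op`, trying URI schemes in order."""
    for scheme in preferred_schemes:
        for form in affordance.get("forms", []):
            ops = form.get("op", [])
            if isinstance(ops, str):  # `op` may be a single string or an array
                ops = [ops]
            if op in ops and urlparse(form["href"]).scheme == scheme:
                return form
    return None  # no binding offers this operation over the given schemes


brightness = THING["properties"]["brightness"]
read_form = select_form(brightness, "readproperty", ["https", "mqtt"])
watch_form = select_form(brightness, "observeproperty", ["https", "mqtt"])
print(read_form["href"])
print(watch_form["href"])
```

In this sketch the agent reads over HTTP but falls back to MQTT for observation, because only the MQTT form declares `observeproperty`.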

+ + + +

Eclipse LMOS [[LMOS]] builds directly on the WoT architecture and uses WoT Thing Descriptions as the native format for both agent and tool descriptions. From the agent-environment interaction perspective, LMOS exposes tool affordances through the full WoT interaction affordance model: Property Affordances, Action Affordances, and Event Affordances, with protocol bindings for HTTP and WebSocket. LMOS thus represents the current initiative most closely aligned with the WoT affordance model as an agent-tool interaction standard.

+
+ +
+

Hypermedia MAS and Signifiers

+ +

The hMAS ontology extends the WoT affordance model with an agent-oriented layer centered on the concept of the Signifier [[CIORTEA19]][[HMAS19]]. Where a WoT Thing Description describes what a Thing can expose, an hMAS Signifier describes what a specific agent can and is permitted to do in a specific context. A Signifier links three elements: an affordance, typically a WoT interaction affordance; a set of ability conditions specifying the capabilities an agent must possess in order to use the affordance; and a set of context conditions specifying when the affordance is available given the agent's role, organizational membership, or workspace state. This structure makes the affordance model normatively aware: an agent reading an hMAS Signifier learns not only how to invoke an affordance but also whether it is appropriate to do so given its current context.
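The three-part structure of a Signifier can be sketched in Python as follows. This is an in-memory stand-in only, not the hMAS RDF vocabulary: the class `Signifier`, the function `is_applicable`, and the ability and context names are all hypothetical, and conditions are reduced to simple set and equality checks.

```python
from dataclasses import dataclass, field


@dataclass
class Signifier:
    """Hypothetical stand-in for an hMAS Signifier (not the RDF model)."""
    affordance: str                                        # affordance being signified
    required_abilities: set = field(default_factory=set)   # ability conditions
    required_context: dict = field(default_factory=dict)   # context conditions


def is_applicable(signifier, abilities, context):
    """True if the agent has every required ability and the context matches."""
    has_abilities = signifier.required_abilities <= set(abilities)
    context_ok = all(context.get(key) == value
                     for key, value in signifier.required_context.items())
    return has_abilities and context_ok


signifiers = [
    Signifier("readTemperature"),
    Signifier("openValve",
              required_abilities={"hydraulics"},
              required_context={"role": "operator"}),
]

# An agent with the right ability but the wrong role sees only one affordance.
visitor = [s.affordance for s in signifiers
           if is_applicable(s, {"hydraulics"}, {"role": "visitor"})]
operator = [s.affordance for s in signifiers
            if is_applicable(s, {"hydraulics"}, {"role": "operator"})]
print(visitor)
print(operator)
```

Filtering on both conditions before invocation is what distinguishes this model from a plain interface description: the same environment yields different actionable sets for different agents.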

+ +

Signifiers are embedded in Resource Profiles of artifacts and workspaces (see Section 5), which agents traverse hypermedia-style from a workspace entry point. This traversal is the primary mechanism by which the hMAS architecture realizes Principle 2 (Single Entry Point) and Principle 3 (Observability): starting from a single entry URL, an agent can navigate the workspace hypermedia graph and discover all actionable affordances, their ability conditions, and their context conditions, without prior configuration. Signifiers are expressed in RDF using the hMAS and WoT ontologies, enabling integration with Linked Data ecosystems and, in principle, machine-readable reasoning over affordance availability.

+ +

Yggdrasil is a server-side implementation of this model in which artifacts are Web resources described by WoT Thing Descriptions, workspaces are navigable hypermedia collections, and artifact operations are bound to HTTP endpoints. Yggdrasil demonstrates that the programming model established by CArtAgO (discussed in Section 9.1.6) can be realized in a Web-native manner, bridging classical MAS environment programming and Hypermedia MAS architecture.

+ + +
+ +

The two approaches described above address affordance description, invocation, and observability within a single integrated model, grounded in W3C standards and the REST architectural style. The following two approaches address affordance description and invocation as their primary concern, without a native observability model or semantic annotation layer, and are widely deployed in production Web development and agentic AI systems respectively.

+ +
+

OpenAPI Specification

+ +

The OpenAPI Specification [[OPENAPIS-3.1.0]] is the industry-standard format for describing HTTP APIs, defining endpoints, HTTP methods, parameters, request and response schemas, and security mechanisms. It is the most widely adopted API description format in production Web development and serves as a practical baseline for tool descriptions in agentic systems: most tool generation pipelines begin from an OpenAPI specification and convert or wrap it into an agent-callable form. The UTCP specification (discussed below) explicitly takes OpenAPI as its starting point.

+ +

From an agent-environment interaction standpoint, OpenAPI describes the interface of a Web service but not its affordance semantics. An OpenAPI operation is a typed request-response pair; the specification provides no native concept of observable state, event subscription, or action lifecycle. Semantic annotations are possible through extension fields but are not standardized. This representational scope limits the degree to which agents can reason about OpenAPI-described tools without relying on natural language interpretation of documentation fields. Discovery is also not addressed natively: unlike WoT Thing Descriptions, which carry hypermedia links enabling navigation from a single entry point, OpenAPI specifications describe a fixed set of endpoints and provide no mechanism for runtime affordance discovery or environment traversal.

+ +

Practical evidence illustrates both the utility and the limitations of OpenAPI as a tool description baseline. Automated conversion of OpenAPI specifications to MCP tool definitions has been reported to succeed without manual intervention in the majority of cases, while a significant proportion require correcting specification errors before reliable invocation is possible. Separately, bidirectional conversion between OpenAPI specifications and WoT Thing Descriptions has been demonstrated, showing that the two formats are partially compatible but that richer affordance concepts such as event subscriptions and action lifecycle are not representable in OpenAPI without extensions.
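To make the conversion step concrete, the following Python sketch wraps a single OpenAPI operation into a tool definition with `name`, `description`, and `inputSchema` fields, the shape MCP uses for tools. The function `operation_to_tool` and the sample operation are hypothetical. The sketch assumes every operation has an `operationId` and inline parameter schemas; it ignores request bodies, `$ref` resolution, and security, which is where real converters tend to need manual correction.

```python
def operation_to_tool(operation):
    """Map one OpenAPI operation object to an MCP-style tool definition."""
    properties, required = {}, []
    for param in operation.get("parameters", []):
        # Copy the inline JSON Schema of each parameter into the input schema.
        properties[param["name"]] = dict(param.get("schema", {}))
        if param.get("required", False):
            required.append(param["name"])
    return {
        "name": operation["operationId"],  # assumption: operationId is present
        "description": operation.get("summary", ""),
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


# Hypothetical operation, as it might appear under
# paths["/rooms/{roomId}/temperature"]["get"] in an OpenAPI document.
get_temperature = {
    "operationId": "getTemperature",
    "summary": "Read the current temperature of a room.",
    "parameters": [
        {"name": "roomId", "in": "path", "required": True,
         "schema": {"type": "string"}},
        {"name": "unit", "in": "query",
         "schema": {"type": "string", "enum": ["C", "F"]}},
    ],
}

tool = operation_to_tool(get_temperature)
print(tool["name"], tool["inputSchema"]["required"])
```

Note that nothing in the result says whether the temperature can be observed or subscribed to; that information is simply absent from the source description.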

+ + +
+ +
+

Universal Tool Calling Protocol (UTCP)

+ +

The Universal Tool Calling Protocol (UTCP) [[UTCP]] extends OpenAPI 3.1 with agent-focused enhancements targeting multi-protocol tool deployment. A UTCP Tool Manifest lists available tools, their JSON Schema input and output descriptions, and their protocol bindings, which specify how each tool is concretely invoked over the designated transport. Supported bindings include HTTP, CLI, gRPC, GraphQL, and MCP. This binding model is analogous in purpose to WoT Thing Description forms but is scoped to tool-style invocations rather than to the full property, action, and event affordance model.

+ +

UTCP's primary design differentiators relative to MCP are explicit multi-protocol support without a proprietary server requirement, and client-side tool repository management. A UTCP tool may be any existing HTTP API, CLI program, or gRPC service, without the need to deploy a dedicated intermediary. UTCP does not currently define mechanisms equivalent to MCP Resources, Prompts, subscriptions, or server-initiated capabilities, and is narrowly focused on the tool invocation use case. Like MCP, UTCP uses string names as tool identifiers and does not provide semantic annotations.

+
+ + + + + + +
+ +
+

Agent-Tool Protocols

+ +
+

Model Context Protocol (MCP)

+ +

The Model Context Protocol (MCP) [[MCP]] is an open standard for connecting LLM-based applications to external environments and has achieved significant adoption since its introduction, with a rapidly growing ecosystem of server implementations. MCP structures the environment into three primitive types: Tools, which are executable functions that agents can invoke; Resources, which are URI-addressed data accessible to agents; and Prompts, which are reusable, server-defined templates. The protocol uses a client-server architecture over JSON-RPC 2.0, with Streamable HTTP as the default transport since the 2025-03-26 specification revision, and stdio for local processes.

+ +

The core interaction operations are tools/list and tools/call for tool discovery and invocation; resources/list and resources/read for resource enumeration and access; and the optional resources/subscribe and notifications/resources/updated pair for resource change notification, which requires explicit capability negotiation and is not uniformly supported. All tool invocations are synchronous blocking remote procedure calls; no action lifecycle mechanism exists for status querying or cancellation. Two server-initiated capabilities invert the typical client-server flow: Sampling allows a server to request an LLM completion from the client, and Elicitation, introduced in the 2025-06-18 revision, allows a server to request structured input from the end user at runtime.
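The operations named above are ordinary JSON-RPC 2.0 requests. The Python sketch below builds three of them as dictionaries without sending anything; the helper `jsonrpc_request`, the tool name `get_weather`, its arguments, and the resource URI are hypothetical placeholders, while the method names and the `name`/`arguments` and `uri` parameter keys follow the MCP specification.

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requests need unique ids


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object for the given method."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Enumerate the tools a server offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by its string name (tool and arguments are made up).
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Zurich"}},
)

# Ask to be notified when a resource changes (optional server capability).
subscribe = jsonrpc_request(
    "resources/subscribe",
    {"uri": "file:///example/log.txt"},
)

print(json.dumps(call_tool))
```

The tool is addressed only by the string `get_weather`; nothing in the request carries a type or vocabulary reference that a client could check independently of the server's free-text description.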

+ +

MCP deliberately omits semantic typing, representing tool descriptions as free-text strings processed by the language model rather than as machine-interpretable semantic annotations. This design prioritizes language model compatibility and developer convenience but limits the degree to which tool selection, composition, and verification can be automated without natural language interpretation. Empirical assessments of MCP tool descriptions in production servers have identified widespread quality issues, including ambiguous parameter descriptions, missing examples, and inconsistent naming, which measurably reduce invocation reliability in benchmarked tasks.

+ + + + + + +
+
+ +
+

Function Calling and Tool Management

+ +
+

LLM Provider Function Calling

+ +

Major LLM providers have each defined a function calling or tool use API at the model inference level. These specifications determine the format of tool definitions provided to the model, the structure of model-generated invocation requests, and the format of results returned. Because MCP client libraries and most orchestration frameworks implement their tool-use logic over these APIs, understanding provider-level function calling is necessary context for evaluating the tool-use protocol stack as a whole.

+ +

The OpenAI tool use API defines tool definitions consisting of a name, a natural language description, and a JSON Schema for input parameters. The model returns a structured tool_calls array specifying tool name and JSON-encoded arguments; results are injected as messages in the conversation history. The Responses API extends this model with built-in tools for Web search, code execution, and file retrieval, and adds an explicit response-chaining mechanism for stateful multi-turn interactions. The Anthropic tool use API follows a structurally similar pattern, with distinctive support for computer use as a typed built-in tool schema specifying desktop automation actions, and a tool_choice parameter for constraining tool selection. The Google Gemini function calling API introduces native parallel function calling, allowing the model to request multiple tool invocations in a single response turn, alongside built-in code execution and search grounding tools integrated at the model level.

+ +

Despite independent origins, the three providers have converged on a common structural baseline: JSON Schema for input definitions, structured model-generated invocation requests, and conversation-history injection of results. MCP has accelerated this convergence by defining a tool description format that is mechanically compatible with all three provider APIs. However, the convergence remains shallow: output schemas, error handling, streaming of partial results, action lifecycle, and semantic annotations are unstandardized across providers. No provider API includes push-based observation, hypermedia navigation, or semantically typed affordances.
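The structural overlap can be shown with a small Python sketch that re-wraps one MCP-style tool definition into the shapes used by the OpenAI Chat Completions `tools` parameter and the Anthropic `tools` parameter. The helper names and the sample tool are hypothetical, and the target shapes reflect the provider formats as understood at the time of writing; Gemini is omitted for brevity.

```python
# Hypothetical MCP-style tool definition (name, description, inputSchema).
mcp_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def to_openai(tool):
    """Wrap the tool for the OpenAI Chat Completions `tools` list."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["inputSchema"],
        },
    }


def to_anthropic(tool):
    """Wrap the tool for the Anthropic Messages API `tools` list."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["inputSchema"],
    }


openai_tool = to_openai(mcp_tool)
anthropic_tool = to_anthropic(mcp_tool)
print(openai_tool["function"]["name"], anthropic_tool["name"])
```

Only key names and nesting change; the JSON Schema passes through untouched. Everything beyond the input schema, such as output typing or error semantics, has no slot to be carried in.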

+ + +
+ +
+

Tool Management and Documentation Pipelines

+ +

In parallel with invocation protocols, a class of tooling addresses the lifecycle concern of how tool descriptions are created, validated, and maintained at scale. Automated pipelines have been developed that convert existing API documentation, including natural language documentation, HTML pages, and OpenAPI specifications, into validated, agent-callable tool definitions, and that test generated descriptions against live endpoints. Such pipelines treat tool description quality as an engineering concern to be managed systematically rather than resolved manually. Protocol-agnostic tool registries provide lifecycle management including registration, versioning, execution tracking, and concurrency control, decoupled from any specific invocation protocol.

+ +

These pipelines and registries are not interoperability standards, but they document engineering practices that a standardization effort should account for. Quality assurance mechanisms, validation schemas, and registry interfaces are natural candidates for standardization if tool-use protocols are to scale in open settings.

+
+
+ +
+

Agent Orchestration Frameworks

+ + + + + +

LangChain defines a Tool abstraction with name, description, and callable function, bridged to provider-specific function calling via a bind_tools API. LangGraph extends this with a stateful graph execution model for multi-step tool use workflows in which nodes represent agent steps and edges represent conditional transitions, providing a framework-level answer to the tool composition problem discussed in the context of WoT action affordances. LangChain also provides adapters for consuming MCP servers as tool sources and for exposing LangChain tools as MCP servers.

+ +

Microsoft Semantic Kernel structures environment interaction through Plugins, collections of named functions with metadata for language model invocation, which may be defined in code, from OpenAPI specifications, or from MCP server connections. A Planner component supports LLM-driven automatic selection and chaining of plugins. Semantic Kernel is notable for its explicit integration with enterprise identity systems and organizational services, making it the most enterprise-oriented of the surveyed frameworks.

+ +

AutoGen models agents as conversational entities with tools registered via decorator patterns. Its multi-agent conversation model allows agents to delegate tasks to other agents within the same conversational framework, blurring the boundary between agent-to-environment and agent-to-agent interaction. CrewAI similarly provides a delegation mechanism implemented as a tool invocation, allowing one agent to invoke another as if it were a tool. Both frameworks can consume MCP servers as tool sources through adapter libraries.

+ + +
+ +
+

Perception and Observability Mechanisms

+ +

Principle 3 (Observability) requires that agents be able to selectively monitor resources and receive updates about relevant events using Web standards. The following subsections survey the concrete mechanisms available for realizing the push-based perception channel that the situated agent pattern requires. These mechanisms are the protocol-level instantiation of WoT Event Affordances and Property Monitoring operations, and correspond to the resources/subscribe mechanism in MCP.

+ +

WoT Property Affordances with the observable flag and WoT Event Affordances constitute the semantically richest and most standardized mechanisms: they provide machine-readable payload schemas, typed subscription operations with defined semantics (observeproperty, subscribeevent), protocol-agnostic binding selection, and integration with the WoT security model for authenticated subscriptions. W3C WebSub [[WEBSUB]] defines a publish-subscribe mechanism for Web resources over HTTP in which a subscriber registers interest at a hub and receives updated representations when the publisher posts new content. WebSub is protocol-agnostic at the content level and is supported as a WoT event binding and used in hMAS workspace event propagation. Server-Sent Events [[SSE]] provide a standardized, HTTP-native unidirectional event stream, simple to implement, compatible with HTTP proxies, and with automatic reconnection semantics; they are used as a transport in A2A and as a WoT event binding. WebSocket [[WEBSOCKET]] provides full-duplex, low-latency bidirectional communication over a single TCP connection, supporting richer interaction patterns such as server-initiated actions and streaming of partial results, and is used in WoT Thing Description bindings and in Eclipse LMOS. For IoT and constrained device environments, CoAP Observe [[RFC7641]] enables subscription to resource updates over CoAP, and MQTT [[MQTT]] is a lightweight topic-based publish-subscribe protocol for constrained networks; both are first-class WoT binding targets and represent the primary perception mechanisms for agents deployed in physical environments.
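Of the mechanisms listed, Server-Sent Events has the simplest wire format, which the following Python sketch parses from an already received text buffer. The function `parse_sse` and the sample payload are hypothetical. The parser handles only the `event` and `data` fields and comment lines; it ignores `id` and `retry`, and it relies on `str.splitlines`, which splits on a few more separators than the SSE format defines.

```python
def parse_sse(stream_text):
    """Parse a text/event-stream buffer into a list of event dicts."""
    events = []
    event_type, data_lines = "", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the pending event, if it has data.
            if data_lines:
                events.append({"event": event_type or "message",
                               "data": "\n".join(data_lines)})
            event_type, data_lines = "", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    return events


# Hypothetical stream: a keep-alive comment, a typed event, and an untyped
# event whose data spans two lines.
stream = (
    ": keep-alive\n\n"
    "event: propertyChange\n"
    'data: {"brightness": 40}\n\n'
    "data: first line\n"
    "data: second line\n\n"
)

events = parse_sse(stream)
for event in events:
    print(event["event"], repr(event["data"]))
```

An agent using this channel receives `propertyChange` notifications without issuing any request after the initial subscription, which is the separation of perception from action that the situated agent pattern calls for.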

+ + +
+ +
+

Classical MAS Environments

+ +
+

CArtAgO and JaCaMo

+ +

CArtAgO (Common ARTifact infrastructure for AGents Open environments) [[CARTAGO]] provides a programming model for MAS virtual environments structured as collections of Artifacts: shared, stateful objects that agents can perceive and act upon concurrently. Each artifact exposes Observable Properties, named values that agents can read and monitor, with changes automatically generating perception events routed to agents that have joined the artifact's workspace; Operations, named, invocable procedures with typed parameters that may be synchronous or asynchronous; and Signals, typed asynchronous notifications emitted by the artifact. The structural separation of perception and action is architecturally enforced in CArtAgO: agents receive environment events as a continuous stream, independently of their action cycles, conforming exactly to the situated agent pattern from Section 3.3.2.
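The artifact abstraction can be approximated in a few lines of Python. This is not the CArtAgO Java API: the classes `Artifact` and `Counter` and their methods are hypothetical, and only the vocabulary (observable property, operation, signal, and `focus` as the act of starting to observe) is borrowed from CArtAgO. Percepts are delivered through callbacks here, whereas CArtAgO routes them to agents through its own runtime.

```python
class Artifact:
    """Hypothetical artifact base: observable properties plus signals."""

    def __init__(self, **properties):
        self._properties = dict(properties)
        self._observers = []

    def focus(self, callback):
        """Register an observer; it receives every later percept."""
        self._observers.append(callback)

    def _notify(self, kind, name, value):
        for callback in self._observers:
            callback(kind, name, value)

    def update_property(self, name, value):
        """Change an observable property and emit a perception event."""
        self._properties[name] = value
        self._notify("property", name, value)

    def signal(self, name, payload=None):
        """Emit a one-off notification that is not tied to stored state."""
        self._notify("signal", name, payload)


class Counter(Artifact):
    def __init__(self):
        super().__init__(count=0)

    def inc(self):
        """Operation: increment the count and signal on even values."""
        self.update_property("count", self._properties["count"] + 1)
        if self._properties["count"] % 2 == 0:
            self.signal("even")


percepts = []
counter = Counter()
counter.focus(lambda kind, name, value: percepts.append((kind, name, value)))
counter.inc()  # invoked by any agent; the observer did not act
counter.inc()
print(percepts)
```

The observer accumulates percepts although it never invoked an operation itself, which is the independence of the perception channel described above.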

+ +

In JaCaMo [[JACAMO20]], Belief-Desire-Intention agents interact with CArtAgO artifacts natively, and organizational structures can constrain which operations are available to agents in specific roles, a direct precursor to hMAS Signifiers. CArtAgO predates WoT Thing Descriptions by nearly a decade and establishes the conceptual baseline that the hMAS ontology formalizes. CArtAgO is not Web-native: artifact identifiers are not IRIs, and interaction is mediated through a Java API. The Yggdrasil server (discussed above in the subsection on Hypermedia MAS and Signifiers) provides a Web-native implementation of the CArtAgO model in which artifacts are Web resources described by WoT Thing Descriptions.

+ + +
+ +
+

FIPA Agent Actions and Service Invocation

+ +

The FIPA standardization work [[FIPAARCH]] does not define an explicit environment interaction model separate from agent communication. In FIPA-based systems such as JADE, environment interaction is modeled as service invocation: agents discover services through the FIPA Directory Facilitator and invoke them by sending ACL request messages to the agents or service wrappers that provide them. This design conflates the agent-to-agent and agent-to-environment interaction channels, which is a recognized architectural limitation relative to the four-dimension model from Section 3.2. There is no FIPA concept equivalent to the observable property, the action affordance lifecycle, or the event subscription; the environment in FIPA-based systems is effectively transparent, accessible to agents only through message returns with no independent perception channel.

+
+
+

Comparison

<table>
<thead>
<tr><th>Initiative</th><th>Perception Model</th><th>Action Model</th><th>Async Action Support</th><th>Cancellation / Status</th><th>Observability (Push)</th><th>Protocol Bindings</th><th>Semantic Typing</th><th>Hypermedia Navigation</th></tr>
</thead>
<tbody>
<tr><td>W3C WoT Thing Description</td><td>Property Affordances (read, observe)</td><td>Action Affordances (invoke)</td><td>Yes</td><td>Yes (queryaction, cancelaction)</td><td>Event Affordances (SSE, WebSocket, MQTT, CoAP Observe, WebSub)</td><td>HTTP, CoAP, MQTT, WebSocket, and others via binding templates</td><td>Yes (JSON-LD with external vocabularies)</td><td>Yes (forms, links)</td></tr>
<tr><td>hMAS Signifiers</td><td>Observable Properties (via WoT)</td><td>Signifiers + WoT Action Affordances</td><td>Yes (via WoT)</td><td>Yes (via WoT)</td><td>WoT Events + WebSub</td><td>HTTP (REST), WebSub</td><td>Yes (RDF, hMAS and WoT ontologies)</td><td>Yes (HATEOAS, workspace traversal)</td></tr>
<tr><td>OpenAPI</td><td>Via response bodies</td><td>HTTP operations</td><td>Partial (202 Accepted pattern)</td><td>No (convention only)</td><td>Webhooks (extension)</td><td>HTTP only</td><td>Weak (optional extensions)</td><td>Partial (links object)</td></tr>
<tr><td>UTCP</td><td>None</td><td>Tool calls (multi-protocol)</td><td>No</td><td>No</td><td>No</td><td>HTTP, gRPC, CLI, GraphQL, MCP</td><td>No (JSON Schema)</td><td>No</td></tr>
<tr><td>MCP</td><td>Resources (pull only)</td><td>Tool calls (synchronous RPC)</td><td>No</td><td>No</td><td>Optional: URI-only change signal</td><td>Streamable HTTP, stdio</td><td>No (free text + JSON Schema)</td><td>No (flat tools/list)</td></tr>
<tr><td>Eclipse LMOS (tool descriptions)</td><td>WoT Property Affordances</td><td>WoT Action Affordances</td><td>Yes</td><td>Yes (via WoT)</td><td>WoT Event Affordances</td><td>HTTP, WebSocket</td><td>Yes (WoT JSON-LD)</td><td>Partial (WoT Discovery)</td></tr>
<tr><td>LLM Provider Function Calling (OpenAI, Anthropic, Gemini)</td><td>Via action returns</td><td>Tool invocations + built-in tools</td><td>No</td><td>No</td><td>No</td><td>HTTP (provider API)</td><td>No (JSON Schema)</td><td>No</td></tr>
<tr><td>LangChain / LangGraph</td><td>Via action returns</td><td>Tool invocations (provider-bridged)</td><td>No</td><td>No</td><td>No</td><td>Provider-dependent</td><td>No</td><td>No</td></tr>
<tr><td>Microsoft Semantic Kernel</td><td>Via action returns</td><td>Plugin functions</td><td>Partial (Planner)</td><td>No</td><td>No</td><td>Provider-dependent</td><td>Partial (OpenAPI descriptions)</td><td>No</td></tr>
<tr><td>CArtAgO / JaCaMo</td><td>Observable properties (independent channel)</td><td>Operations (sync and async)</td><td>Yes</td><td>Partial</td><td>Signals (in-process event stream); Yggdrasil: WoT Events</td><td>Non-Web (Java API); Yggdrasil: HTTP</td><td>Partial (domain ontologies); Yggdrasil: WoT JSON-LD</td><td>Yggdrasil: Yes</td></tr>
<tr><td>FIPA / JADE</td><td>Via ACL message returns</td><td>Service requests (ACL messages)</td><td>Partial (FIPA Interaction Protocols)</td><td>Partial (FIPA Interaction Protocols)</td><td>None</td><td>Non-Web (IIOP, HTTP wrappers)</td><td>Partial (FIPA SL ontologies)</td><td>No</td></tr>
</tbody>
</table>
+
+

Discussion

-