Network Telemetry Specs
This page details the telemetry specification to be implemented by the members as part of Sahamati Network Observability.
Open Network Telemetry Specification
We will be adopting the Open Network Telemetry Specification defined by Sunbird, which is built on top of the OpenTelemetry protocol, for all decentralized networks such as ONDC and Beckn. While the majority of the telemetry structure stays the same, there are some minor additions (additional attributes) that are documented as part of the telemetry structure.
Why OpenTelemetry Protocol?
OpenTelemetry is a unified observability framework for collecting, processing, and exporting telemetry data. It offers a standardized approach to instrumenting applications, enabling seamless integration with various monitoring and tracing systems.
Benefits of OpenTelemetry
Standardization: OpenTelemetry provides a consistent instrumentation model across different programming languages and environments. This standardization simplifies the process of instrumenting applications for telemetry data collection.
Interoperability: By adopting OpenTelemetry, organizations can ensure interoperability between their applications and a wide range of observability tools and platforms. This interoperability eliminates vendor lock-in and enables flexibility in tooling choices.
Efficiency: OpenTelemetry reduces the overhead associated with telemetry instrumentation by providing a unified API and automatic instrumentation capabilities. This efficiency improves the performance of applications and minimizes the effort required for instrumentation.
Community-driven: OpenTelemetry is developed and maintained by a vibrant community of contributors from various organizations. This community-driven approach ensures ongoing development, support, and innovation, driving the evolution of telemetry standards.
Ecosystem Integration: OpenTelemetry integrates seamlessly with existing telemetry protocols and standards, such as OpenTracing and OpenCensus. This integration facilitates the migration from legacy telemetry systems to the OpenTelemetry framework.
Future-proofing Telemetry Infrastructure
Adopting OpenTelemetry for telemetry specification future-proofs organizations' telemetry infrastructure by providing a flexible, scalable, and standardized framework for observability. By embracing OpenTelemetry, organizations can ensure compatibility, interoperability, and adaptability in an evolving telemetry landscape.
Telemetry Structure
The Sahamati telemetry specification adopts the event structure defined in the Open Network Telemetry Specification.
Following is the overall structure of the telemetry event:
Adopting the OpenTelemetry protocol enables us to batch multiple events, either of the same type or within the same flow (such as account discovery and linking).
Every event has 3 parts to it:
resource
- Captures the required entity-level global contextual attributes, as explained in the Entity Context section
scope
- Captures the optional transport contextual attributes of the succeeding events, as explained in the Transport Context section
spans/metrics/logRecords
- Captures the actual data about the event, as explained in the Event Data section
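The three-part layout above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the nesting (`resourceSpans`, `scopeSpans`, `spans`) and the key/value attribute encoding follow the general OTLP/JSON layout, while the function name `build_envelope`, the scope name `example-telemetry`, and all sample values are hypothetical and not taken from the Sahamati specification.

```python
def to_key_values(attrs):
    """Encode a flat dict as an OTLP/JSON-style list of key/value pairs.

    For simplicity every value is sent as a string; a real producer would
    pick intValue, boolValue, etc. as appropriate.
    """
    return [{"key": k, "value": {"stringValue": str(v)}} for k, v in attrs.items()]


def build_envelope(entity_context, transport_context, spans):
    """Assemble one batch: resource -> scope -> spans."""
    return {
        "resourceSpans": [
            {
                # Part 1: required entity-level context
                "resource": {"attributes": to_key_values(entity_context)},
                "scopeSpans": [
                    {
                        # Part 2: optional transport-level context
                        "scope": {
                            "name": "example-telemetry",  # hypothetical name
                            "attributes": to_key_values(transport_context),
                        },
                        # Part 3: the actual event data
                        "spans": spans,
                    }
                ],
            }
        ]
    }


envelope = build_envelope(
    {"producer": "example-aa-id", "producerType": "AA"},  # sample values only
    {"count": 1},
    [{"name": "example-api-call"}],
)
```

Which attributes belong under `resource` and which under `scope` is defined by the attribute tables of the specification; the placement shown here is only a placeholder.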
Entity Context
The following entity-level contextual attributes are required for all events:
Detailed examples are given under each event type.
Transport Context
Following are the optional transport contextual attributes that can be sent for every event:
Benefits of Sending Transport-Level Information within Telemetry Events
scope_uuid - Idempotency: Using a UUID allows for the unique identification of each event or batch of events. This is crucial for ensuring idempotency, meaning that duplicate events can be easily identified and discarded. It helps prevent unintended duplication of data, ensuring data integrity and accuracy.
checksum - Data Integrity: Including a checksum in the telemetry events enables recipients to verify that the data has not been tampered with during transmission. By calculating the checksum at the sender's end and verifying it at the receiver's end, any unauthorized modifications or tampering with the data can be detected, ensuring data integrity and security.
count - Completeness Check: Including a count within the telemetry events provides a quick and simple way to verify whether all the events in a batch have been successfully received. By comparing the count of events sent with the count of events received, recipients can quickly identify any discrepancies or missing data. This helps ensure data completeness and reliability, allowing for effective monitoring and troubleshooting of data transmission issues.
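A minimal sketch of how a sender might compute these three attributes and how a receiver might check them. The attribute names `scope_uuid`, `checksum`, and `count` come from this page; everything else is an assumption for illustration. In particular, the checksum algorithm (SHA-256 over a canonical JSON serialization of the events) and the function names are hypothetical, since this page does not state which algorithm the specification mandates.

```python
import hashlib
import json
import uuid


def _digest(events):
    # Assumption: canonical JSON (sorted keys, no extra whitespace) hashed with SHA-256.
    canonical = json.dumps(events, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def transport_context(events):
    """Sender side: build the transport attributes for one batch."""
    return {
        "scope_uuid": str(uuid.uuid4()),  # unique per batch, enables idempotency
        "count": len(events),             # completeness check
        "checksum": _digest(events),      # integrity check
    }


def verify_batch(events, context, seen_uuids):
    """Receiver side: return 'accepted', 'duplicate', 'incomplete' or 'corrupted'."""
    if context["scope_uuid"] in seen_uuids:
        return "duplicate"
    if context["count"] != len(events):
        return "incomplete"
    if context["checksum"] != _digest(events):
        return "corrupted"
    seen_uuids.add(context["scope_uuid"])
    return "accepted"
```

A receiver would keep `seen_uuids` in durable storage rather than in memory; a plain `set` is used here only to keep the sketch self-contained.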
Event Data
Contains the actual data about the event for the defined event types. The structure will be specific for every event type.
Event Types
As explained in the Open Network Telemetry Specification, there are 3 types of events required:
Event to capture API Transactions Data - API
Event to capture Business & Operational Metrics - METRIC
Event to capture Audit Information - AUDIT
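One way to picture the three event types is as the three OpenTelemetry signals. The sketch below assumes that API events travel as spans, METRIC events as metrics, and AUDIT events as log records; that mapping is inferred from the `spans/metrics/logRecords` part of the event structure and should be confirmed against the Open Network Telemetry Specification. The key names are the standard OTLP/JSON ones; `EventType`, `OTLP_KEYS`, and `wrap` are hypothetical helper names.

```python
from enum import Enum


class EventType(Enum):
    API = "API"
    METRIC = "METRIC"
    AUDIT = "AUDIT"


# (top-level key, scope-level key, record-level key) for each event type.
# Assumed mapping: API -> spans, METRIC -> metrics, AUDIT -> log records.
OTLP_KEYS = {
    EventType.API: ("resourceSpans", "scopeSpans", "spans"),
    EventType.METRIC: ("resourceMetrics", "scopeMetrics", "metrics"),
    EventType.AUDIT: ("resourceLogs", "scopeLogs", "logRecords"),
}


def wrap(event_type, resource, scope, records):
    """Place a list of records under the keys that match the event type."""
    top, mid, leaf = OTLP_KEYS[event_type]
    return {top: [{"resource": resource, mid: [{"scope": scope, leaf: records}]}]}
```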
API
The API telemetry event is used by members to share API data with the network observability infrastructure. It contains API transport data, including the API URL, correlation identifiers for mapping multiple interconnected API calls, and response metadata such as status codes and error details.
Following is the event data spec which is the same as defined in the Open Network Telemetry Spec:
Note: Specific APIs may require additional data. This is documented under the specs section for each network node type and for the specific APIs.
Example API Event
The following event (a discover API call, for example) is a complete example event as per the Open Network Telemetry Specification, including all the required sections and attributes described in this document.
METRIC
The metric event is used by members to share business metrics data with the network observability infrastructure. Following is the metric data spec as per the Open Network Telemetry Specification:
Example Metric Event
The following metric event (an AA vs. non-AA usage metric) is a complete example event as per the Open Network Telemetry Specification, including all the required sections and attributes described in this document.
AUDIT
Audit events are used by members to communicate updates and state changes of entities within the network. The entities include domain objects such as consent, as well as the members themselves. In addition, audit events can also be used to store all transaction logs.
Following is the overall structure for log events as per the OpenTelemetry specification:
Note: The attributes to be sent are defined under the specs section for each network node type and for the specific APIs.
Key Attributes of Telemetry Event
spanId
A unique ID that identifies a specific API transaction or audit event
Use the transaction id being passed in the request/response structures.
traceId
A unique ID used to trace an entire end-to-end transaction or flow, for example the account discovery to linking flow
[TBA]
sender.id
A unique ID that identifies the originator of the API transaction
If you are initiating the API call, add your entity ID or URI here
If you are receiving the API call, add the caller's entity ID or URI here
recipient.id
A unique ID that identifies the recipient of the API transaction
If you are initiating the API call, add the destination entity ID or URI here
If you are receiving the API call, add your entity ID or URI here
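The sender.id and recipient.id rules are symmetric, which a short helper can make explicit. This is a sketch only: the function name `party_ids` and its parameters are hypothetical, and the sample identifiers are placeholders.

```python
def party_ids(own_id, peer_id, initiated_by_us):
    """Return sender.id and recipient.id for an API telemetry event.

    own_id:          entity ID or URI of the member producing the event
    peer_id:         entity ID or URI of the other party in the API call
    initiated_by_us: True if this member made the call, False if it received it
    """
    if initiated_by_us:
        return {"sender.id": own_id, "recipient.id": peer_id}
    return {"sender.id": peer_id, "recipient.id": own_id}


# The same call reported from both sides yields identical values.
caller_view = party_ids("example-fiu", "example-aa", initiated_by_us=True)
callee_view = party_ids("example-aa", "example-fiu", initiated_by_us=False)
```

Both members therefore report the same sender and recipient for a given call, which is what allows the two telemetry events to be matched against each other.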
producer
A unique ID that identifies the owner of the telemetry event
Add your entity id or URI here.
producerType
Type of network node
If you are an AA, add AA
If you are an FIP, add FIP
If you are an FIU, add FIU
observedTimeUnixNano
Timestamp at which the event was observed
For an API event, add the timestamp at which you received the response
For a METRIC event, add the timestamp at which the metric event was generated
For an AUDIT event, add the timestamp at which the event was observed
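As the field name suggests, the value is a Unix timestamp in nanoseconds. A minimal sketch of producing one in Python follows; the function names are hypothetical, and whether the value is serialized as a number or as a decimal string should be checked against the event examples in the specification.

```python
import time
from datetime import datetime, timezone


def observed_time_unix_nano():
    """Current time as integer nanoseconds since the Unix epoch (UTC)."""
    return time.time_ns()


def to_iso(ts_nano):
    """Render a nanosecond timestamp as an ISO 8601 UTC string, for debugging."""
    return datetime.fromtimestamp(ts_nano / 1e9, tz=timezone.utc).isoformat()


# For an API event, capture the timestamp right after the response arrives.
observed = observed_time_unix_nano()
```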