Normalize-for-Stream processor
Detects whether a document is OpenTelemetry-compliant and, if it is not, normalizes it as described below. If used in combination with the OTel-related mappings, such as the ones defined in logs-otel@template, the resulting document can be queried seamlessly by clients that expect either ECS or OpenTelemetry Semantic Conventions formats.
This processor is in tech preview and is not available in our serverless offering.
The processor detects OpenTelemetry compliance by checking the following fields:
- resource exists as a key and the value is a map
- resource either doesn't contain an attributes field, or contains an attributes field of type map
- scope is either missing or a map
- attributes is either missing or a map
- body is either missing or a map
- body either doesn't contain a text field, or contains a text field of type String
- body either doesn't contain a structured field, or contains a structured field that is not of type String
If all of these conditions are met, the document is considered OpenTelemetry-compliant and is not modified by the processor.
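The conditions above can be sketched as a single predicate. This is an illustrative Python sketch only, not the processor's implementation: the function name is_otel_compliant is hypothetical, and treating a key with a null value the same as a missing key is an assumption that the text does not confirm.

```python
from typing import Any, Mapping


def _missing_or_map(value: Any) -> bool:
    # Assumption: a null value is treated the same as a missing key.
    return value is None or isinstance(value, Mapping)


def is_otel_compliant(doc: Mapping[str, Any]) -> bool:
    """Return True when doc passes every structural check listed above."""
    resource = doc.get("resource")
    # resource must exist and be a map.
    if not isinstance(resource, Mapping):
        return False
    # resource.attributes, when present, must be a map.
    if not _missing_or_map(resource.get("attributes")):
        return False
    # scope, attributes and body are optional, but must be maps when present.
    for key in ("scope", "attributes", "body"):
        if not _missing_or_map(doc.get(key)):
            return False
    body = doc.get("body") or {}
    # body.text, when present, must be a string.
    text = body.get("text")
    if text is not None and not isinstance(text, str):
        return False
    # body.structured, when present, must not be a string.
    if isinstance(body.get("structured"), str):
        return False
    return True
```

Note that the check is purely structural: it looks at the types of a handful of top-level keys and never inspects individual attribute names.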
If the document is not OpenTelemetry-compliant, the processor normalizes it as follows:
Specific ECS fields are renamed to have their corresponding OpenTelemetry Semantic Conventions attribute names. These include the following:
ECS Field    Semantic Conventions Attribute
span.id      span_id
trace.id     trace_id
message      body.text
log.level    severity_text
The processor first looks for the nested form of the ECS field. If the nested form does not exist, it looks for a top-level field with the dotted field name.
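The nested-first lookup can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the processor's code: the helper names (pop_nested, set_nested, rename_ecs_fields) are hypothetical, removing parent objects left empty by the rename is an assumption, and the sketch assumes the target parent (body) is either absent or a map. The text does not say what happens to a dotted duplicate when the nested form wins; this sketch simply leaves it untouched.

```python
from typing import Any, Dict

# Rename table taken from the list of renamed fields above.
RENAMES = {
    "span.id": "span_id",
    "trace.id": "trace_id",
    "message": "body.text",
    "log.level": "severity_text",
}

_MISSING = object()  # sentinel, so that None remains a legal field value


def pop_nested(doc: Dict[str, Any], dotted: str) -> Any:
    """Remove and return the value at a nested path, or _MISSING."""
    parts = dotted.split(".")
    parents = []
    node = doc
    for part in parts[:-1]:
        child = node.get(part)
        if not isinstance(child, dict):
            return _MISSING
        parents.append((node, part))
        node = child
    if parts[-1] not in node:
        return _MISSING
    value = node.pop(parts[-1])
    # Assumption: parents that became empty are removed.
    for parent, key in reversed(parents):
        if parent[key]:
            break
        del parent[key]
    return value


def set_nested(doc: Dict[str, Any], dotted: str, value: Any) -> None:
    """Set a value at a nested path, creating parent maps as needed."""
    parts = dotted.split(".")
    node = doc
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value


def rename_ecs_fields(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the renames in place, preferring the nested form."""
    for ecs_name, otel_name in RENAMES.items():
        value = pop_nested(doc, ecs_name)        # 1. nested form
        if value is _MISSING and ecs_name in doc:
            value = doc.pop(ecs_name)            # 2. dotted top-level key
        if value is not _MISSING:
            set_nested(doc, otel_name, value)
    return doc
```

For a document that carries both a nested span object and a top-level span.id key, the nested value is the one that ends up in span_id, which matches the normalization example later in this page.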
Other specific ECS fields that describe resources and have corresponding counterparts in the OpenTelemetry Semantic Conventions are moved to the resource.attributes map. A field is considered a resource attribute if it meets both of the following conditions:
- It is an ECS field that has a corresponding counterpart (either with the same name or with a different name) in OpenTelemetry Semantic Conventions.
- The corresponding OpenTelemetry attribute is defined in Semantic Conventions within a group that is defined as type: entity.
All other fields, except for @timestamp, are moved to the attributes map.
All non-array entries of the attributes and resource.attributes maps are flattened. Flattening means that nested objects are merged into their parent object, and the keys are concatenated with a dot. See examples below.
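The flattening rule can be illustrated with a short recursive function. This is a Python sketch of the rule as described, not the processor's implementation; the function name flatten is hypothetical, and dropping empty nested objects is a side effect of this sketch rather than documented behavior.

```python
from typing import Any, Dict


def flatten(obj: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Merge nested objects into one map whose keys are joined with dots."""
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: recurse and lift its entries into the parent.
            flat.update(flatten(value, path))
        else:
            # Scalars and arrays are kept as is; arrays are not descended into.
            flat[path] = value
    return flat
```

Applied to a nested http object with a method, a url.path and a headers array, this yields the keys http.method, http.url.path and http.headers, with the array of header objects left unchanged.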
If an OpenTelemetry-compliant document is detected, the processor does nothing. For example, the following document will stay unchanged:
{
  "resource": {
    "attributes": {
      "service.name": "my-service"
    }
  },
  "scope": {
    "name": "my-library",
    "version": "1.0.0"
  },
  "attributes": {
    "http.method": "GET"
  },
  "body": {
    "text": "Hello, world!"
  }
}
If a non-OpenTelemetry-compliant document is detected, the processor normalizes it. For example, the following document:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "service": {
    "name": "my-service",
    "version": "1.0.0",
    "environment": "production",
    "language": {
      "name": "python",
      "version": "3.8"
    }
  },
  "log": {
    "level": "INFO"
  },
  "message": "Hello, world!",
  "http": {
    "method": "GET",
    "url": {
      "path": "/api/v1/resource"
    },
    "headers": [
      {
        "name": "Authorization",
        "value": "Bearer token"
      },
      {
        "name": "User-Agent",
        "value": "my-client/1.0"
      }
    ]
  },
  "span": {
    "id": "1234567890abcdef"
  },
  "span.id": "abcdef1234567890",
  "trace.id": "abcdef1234567890abcdef1234567890"
}
will be normalized into the following form:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "resource": {
    "attributes": {
      "service.name": "my-service",
      "service.version": "1.0.0",
      "service.environment": "production"
    }
  },
  "attributes": {
    "service.language.name": "python",
    "service.language.version": "3.8",
    "http.method": "GET",
    "http.url.path": "/api/v1/resource",
    "http.headers": [
      {
        "name": "Authorization",
        "value": "Bearer token"
      },
      {
        "name": "User-Agent",
        "value": "my-client/1.0"
      }
    ]
  },
  "body": {
    "text": "Hello, world!"
  },
  "severity_text": "INFO",
  "span_id": "1234567890abcdef",
  "trace_id": "abcdef1234567890abcdef1234567890"
}
If the message field in the ingested document contains a JSON object encoded as a string, the processor determines whether it is in ECS format based on the presence or absence of the @timestamp field. If the @timestamp field is present, the message field is considered to be in ECS format, and its contents are merged into the root of the document and then normalized as described above. The @timestamp from the message field overrides the root @timestamp field in the resulting document.
If the @timestamp field is absent, the message field is moved to the body.structured field as is, without any further normalization.
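The decision described above can be sketched as a pre-processing step that runs before the rest of the normalization. This is a hedged Python sketch, not the processor's code: the function name expand_json_message is hypothetical, the shallow merge (top-level keys from the message replace root keys of the same name) is an assumption about what "merged into the root" means, and the sketch assumes the document has no body field or a body that is a map.

```python
import json
from typing import Any, Dict


def expand_json_message(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc with a JSON message expanded as described above."""
    message = doc.get("message")
    if not isinstance(message, str):
        return doc
    try:
        parsed = json.loads(message)
    except json.JSONDecodeError:
        return doc  # plain-text message: nothing to expand
    if not isinstance(parsed, dict):
        return doc  # JSON, but not an object (for example a number or a list)

    result = {k: v for k, v in doc.items() if k != "message"}
    if "@timestamp" in parsed:
        # ECS-JSON: merge into the root. Values from the message win,
        # so its @timestamp overrides the root @timestamp.
        # Assumption: the merge is shallow.
        result.update(parsed)
    else:
        # Not ECS: keep the parsed object untouched under body.structured.
        body = dict(result.get("body") or {})
        body["structured"] = parsed
        result["body"] = body
    return result
```

In the ECS case the returned document still has to go through the renaming, resource-attribute and flattening steps; in the non-ECS case only the fields outside body.structured are normalized further.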
For example, if the message field is an ECS-JSON, as follows:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "message": "{\"@timestamp\":\"2023-10-01T12:01:00Z\",\"log.level\":\"INFO\",\"service.name\":\"my-service\",\"message\":\"The actual log message\",\"http\":{\"method\":\"GET\",\"url\":{\"path\":\"/api/v1/resource\"}}}"
}
it will be normalized into the following form:
{
  "@timestamp": "2023-10-01T12:01:00Z",
  "severity_text": "INFO",
  "body": {
    "text": "The actual log message"
  },
  "resource": {
    "attributes": {
      "service.name": "my-service"
    }
  },
  "attributes": {
    "http.method": "GET",
    "http.url.path": "/api/v1/resource"
  }
}
However, if the message field is not recognized as ECS format, as follows:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "log": {
    "level": "INFO"
  },
  "service": {
    "name": "my-service"
  },
  "tags": ["user-action", "api-call"],
  "message": "{\"root_cause\":\"Network error\",\"http\":{\"method\":\"GET\",\"url\":{\"path\":\"/api/v1/resource\"}}}"
}
it will be normalized into the following form:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "severity_text": "INFO",
  "resource": {
    "attributes": {
      "service.name": "my-service"
    }
  },
  "attributes": {
    "tags": ["user-action", "api-call"]
  },
  "body": {
    "structured": {
      "root_cause": "Network error",
      "http": {
        "method": "GET",
        "url": {
          "path": "/api/v1/resource"
        }
      }
    }
  }
}