Compliance and Digital Exhaust
Problem Statement
Core to any Data and Analytics service that an organization provides is the capability to process data sourced from customer environments.
A “Customer Environment” is an environment that the customer owns and in which customer-owned tools, applications, and systems operate to support the customer’s business. A Customer Environment can be managed by the customer, or the customer may contract out managed support services to one or more managed service providers.
The tools, applications, and systems running in a Customer Environment typically generate “digital exhaust”. Digital exhaust of type “telemetry” is commonly used to enable “Observability”.
Digital Exhaust
In this context, Digital Exhaust refers to one or more of the following:
- Data/information about what is in the environment, for example inventory information that identifies what has been deployed and the key relationships between inventory items (e.g. what uses what).
- Data/information generated explicitly and implicitly by a customer environment. This includes, but is not limited to:
  - Log information: log files generated by the infrastructure (compute, storage, networking, runtime tools, cloud services, “middleware”, etc.) and by the applications (packaged or custom built) that use the infrastructure.
    - Telemetry (records that occur at a specific time): structured or unstructured lines of text that an application emits in response to some event in the code. Logs are distinct records of “what happened” to or with a specific system attribute at a specific time.
  - Runtime metric information: metrics generated by the infrastructure (compute, storage, networking, runtime tools, cloud services, “middleware”, etc.) and by the applications (packaged or custom built) that use the infrastructure.
    - Telemetry (time series): a metric is a value that expresses some data about a system. Metrics are usually represented as counts or measures, and are often aggregated or calculated over a period of time. A metric can tell you how much memory a process is using out of the total, or the number of requests per second that a service is handling.
  - Traces:
    - Telemetry (time series): a single trace shows the activity for an individual transaction or request as it flows through an application. Traces are a critical part of observability, as they provide context for other telemetry.
  - Security audit events: who did what, when, and where
  - Change Management events
  - Automation tool output traces
  - Workflow routing audit trail
  - Crash/core dump logs
- Data generated by systems that enable the management of the environment: Incident/Problem/Change tracking information, e.g. service request tickets
- CI/CD pipeline “exhaust”: history/audit records of all changes made to an environment
- Intrusion Detection events
Note: In this context, “Digital Exhaust” is distinct from the data/content that applications and systems manage and persist to enable the customer’s business.
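The three telemetry types described above differ mainly in shape. The following is a minimal sketch, assuming simplified record layouts; the class and field names are illustrative and are not taken from any particular observability product or standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LogRecord:
    """A discrete record of what happened at a specific time."""
    timestamp: float   # seconds since the epoch
    source: str        # emitting system or application (illustrative field)
    message: str       # structured or unstructured text


@dataclass
class MetricSample:
    """One point in a time series, e.g. memory in use or requests per second."""
    timestamp: float
    name: str
    value: float
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class TraceSpan:
    """One step of a single request as it flows through an application."""
    trace_id: str                  # shared by every span of the same request
    span_id: str
    parent_span_id: Optional[str]  # None for the first span of the request
    operation: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def spans_for_trace(spans: List[TraceSpan], trace_id: str) -> List[TraceSpan]:
    """Return the spans that belong to one request, ordered by start time."""
    return sorted((s for s in spans if s.trace_id == trace_id), key=lambda s: s.start)
```

Grouping spans by `trace_id` is what lets a trace give context to other telemetry: a log record or metric sample captured during the same interval can be related back to the request that caused it.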
Opportunity
The OPPORTUNITY is that this collection of information is a rich source that the customer, or a Managed Service Provider, can use to advise, guide, and automate the customer’s IT Operations using AI and Analytics.
Challenge
The CHALLENGE is that the types of information listed above can typically be viewed as “Customer Owned Data” that may include personally identifiable data, customer confidential data, or even Sensitive Personal Information. As a result, the management, placement, movement, and sharing of this data fall under multiple controls. This data is not just about people; it is also information about the systems that make up the customer’s environment.
Example Areas of concern:
- [The staff managing the “data exhaust” environment] could, in theory, affect the management of the client environment, which could have implications for the client’s environment itself (e.g. automation no longer automating resolutions against the client’s environment). This means that staff managing the Platform should be in support locations that adhere to the client’s contractual constraints.
- [The staff managing the “data exhaust” environment] may see the IP addresses and host names of client servers. By working as a team across multiple separation-of-duty boundaries, the team might be able to gather other information that, combined with the IP and host name information, could reveal a client’s identity.
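One way to reduce the exposure of IP addresses and host names is to replace them with keyed pseudonyms before platform staff see the data. The sketch below is only an illustration under stated assumptions: the `.example.internal` host naming convention, the function names, and the idea of a per-customer secret key are all hypothetical, and pseudonymization lowers but does not eliminate the risk of re-identification through correlation with other information.

```python
import hashlib
import hmac
import re

# Matches dotted-quad IPv4 addresses (no range validation; illustrative only).
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Hypothetical convention: every customer host name ends in ".example.internal".
HOSTNAME_PATTERN = re.compile(
    r"\b[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*\.example\.internal\b"
)


def pseudonym(value: str, key: bytes, prefix: str) -> str:
    """Map a value to a stable token that cannot be reversed without the key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}-{digest[:12]}"


def pseudonymize_line(line: str, key: bytes) -> str:
    """Replace host names and IPv4 addresses in one log line with pseudonyms."""
    line = HOSTNAME_PATTERN.sub(lambda m: pseudonym(m.group(0), key, "host"), line)
    line = IPV4_PATTERN.sub(lambda m: pseudonym(m.group(0), key, "ip"), line)
    return line
```

Because the same input and key always give the same token, events from one server can still be correlated for analytics, while the real address stays hidden from anyone who does not hold the key.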
Controls
Types of controls include:
- Legislation (laws of countries, states/provinces, municipalities)
- Regulation (standards that are not codified as laws but are imposed by a regulatory body, which can be global, country specific, specific to a state/province, etc.)
- Organizational standards
- Customer contracts, for example a contract written between a global organization/company and a provider
- Emotions: sometimes there is no written standard, but a customer may ask for or demand something mainly because of a fear or perception that specific data/information controls are needed in order to win the business.
These typically influence:
- Where data/information can be stored
- How data/information is stored (e.g. encrypted at rest, who owns the keys)
- What level of multi-tenancy is available, in other words how far one customer’s data may be intermingled with other customers’ data within a persistence solution
- Who has the ability to access the system/infrastructure hosting the data/information persistence technologies
The typical REACTION that results
Customers, and/or MSP Delivery teams supporting a customer, will often present one or more of the following requirements. These can come in the form of contractual requirements.
(This is a multiple-choice list.)
- Data Residency: all “Digital Exhaust” needs to stay within one of the following (location specific):
  - The customer’s physical data center
  - The customer’s private cloud
  - A public cloud account owned by the customer
  - The state/province in which the data was generated
  - The country in which the data was generated
  - The geographic region in which the data was generated
- Tenancy:
  - The data must be kept in a persistent store (DB, storage, disk, etc.) that is physically separate from other customers’ data
  - The digital exhaust data can be kept in the same persistent store (DB, storage, disk, etc.) as other customers’ data
- Encryption of customer data: data needs to be encrypted with
  - The customer required to hold the encryption key (e.g. so that they can delete the key and thereby remove all access to their data)
  - The customer not required to hold the encryption key
  - Only selected data fields encrypted
- Any data that is shared outside the customer, and/or outside any of the above location-specific choices, will need to be
  - Reviewed and approved by the customer
  - Shared only with specific locations
  - Anonymized/redacted for select elements in the data
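The choices above can be captured as a per-customer policy record that a platform checks before data is shared. The following is a minimal sketch under stated assumptions: `ExhaustPolicy`, `sharing_violations`, and every field name are hypothetical and only mirror the list above; they are not part of any existing product or contract template.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, List


class Residency(Enum):
    CUSTOMER_DATA_CENTER = "customer_data_center"
    CUSTOMER_PRIVATE_CLOUD = "customer_private_cloud"
    CUSTOMER_PUBLIC_CLOUD_ACCOUNT = "customer_public_cloud_account"
    STATE_OR_PROVINCE = "state_or_province"
    COUNTRY = "country"
    GEOGRAPHIC_REGION = "geographic_region"


class Tenancy(Enum):
    DEDICATED_STORE = "dedicated_store"  # physically separate persistent store
    SHARED_STORE = "shared_store"        # same store as other customers


@dataclass(frozen=True)
class ExhaustPolicy:
    """Hypothetical record of one customer's digital exhaust requirements."""
    residency: Residency
    tenancy: Tenancy
    customer_holds_key: bool
    encrypt_selected_fields_only: bool = False
    sharing_requires_approval: bool = True
    allowed_share_locations: FrozenSet[str] = frozenset()  # empty = no limit
    redact_fields: FrozenSet[str] = frozenset()


def sharing_violations(
    policy: ExhaustPolicy,
    destination: str,
    approved: bool,
    fields_present: Iterable[str],
) -> List[str]:
    """List the reasons a proposed share would break the policy (empty if none)."""
    problems = []
    if policy.sharing_requires_approval and not approved:
        problems.append("customer approval missing")
    if policy.allowed_share_locations and destination not in policy.allowed_share_locations:
        problems.append(f"destination {destination!r} not permitted")
    leaked = sorted(set(fields_present) & policy.redact_fields)
    if leaked:
        problems.append("fields not redacted: " + ", ".join(leaked))
    return problems
```

For example, a policy with `allowed_share_locations=frozenset({"DE"})` and `redact_fields=frozenset({"ip", "hostname"})` would report a violation for an unapproved share to any other location, or for any payload that still contains an `ip` or `hostname` field.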