
Privacy whitepaper

Purpose

The purpose of this document is to provide additional information on Microsoft Dragon Copilot's use and processing of Customer Data.

The basics of Microsoft Dragon Copilot

Microsoft Dragon Copilot streamlines documentation, surfaces information, and automates tasks across care settings. It is an extensible workspace that uniquely combines ambient conversation capture and automatic note creation, natural language AI prompting, and dictation with advanced generative AI. Dragon Copilot offers a unified experience and supports clinicians across all stages of their workflow, allowing them to create, summarize, search, automate, and analyze data like never before. Part of Microsoft Cloud for Healthcare, it is built on a modern architecture with enhanced security and can take clinical productivity to new heights while helping boost clinician well-being and patient experience, increase efficiency, and improve financial impact.

Specifically, Dragon Copilot is an AI-powered, voice-enabled solution that records the patient-clinician encounter conversation and converts the conversation into draft medical documentation. Dragon Copilot further leverages AI to assist the physician with knowledge work and other aspects of their workflow. It is compatible with certain electronic health record (EHR) systems and draft notes can be easily transferred for final review and signature using a single click or voice command.

Our philosophy

We work together with customers to bring healthcare solutions to the forefront while protecting patient privacy and confidentiality, a critical consideration for every healthcare organization. We continually evolve and enhance our security, processes, training, practices, and product; and we share learnings with our customers to ensure that industry standards of data privacy and security are met.

The importance and use of data

Microsoft Dragon Copilot depends on data to continually improve the accuracy and capability of its core functionality and AI models, which are trained to transcribe a clinical encounter audio recording and summarize it into well-formatted, standard draft medical documentation.

To achieve the highest accuracy, Customer Data from patient encounter recordings is de-identified in accordance with HIPAA standards (“De-identified Data”) or pseudonymized (in an unlinked manner) in accordance with GDPR standards (“Pseudonymized Data”). This information can be further supplemented with patient demographics or other data provided via an optional integration with a customer EHR or entered into the system through user input.

This data, including both aggregated and specialty-specific encounter data, is then used for continued improvement of Microsoft Dragon Copilot's artificial intelligence (AI) models.

Customer Data is used only to improve Microsoft Dragon Copilot and its underlying AI models and is never used to train other AI, such as large language models or speech recognition models, without customer consent. Customer Data processed by Dragon Copilot may be subject to human review by Microsoft employees and sub-processors and is used to support service improvement, including training the artificial intelligence models powering Dragon Copilot.

Data rights

Microsoft has a strict data policy that clearly states it will not sell or license any Customer Data acquired or generated by Microsoft Dragon Copilot without customer consent. Customer Data is used solely to improve medical documentation AI and product delivery for clinicians.

Data acquisition and transmission

During a patient encounter, encounter audio is securely streamed to Microsoft Dragon Copilot for processing. Once the patient encounter is complete, the clinician stops the recording. If the audio cannot be streamed to Microsoft Dragon Copilot, it is encrypted and stored locally on the device until a secure connection to Microsoft Dragon Copilot is made. Once the full encounter audio file is uploaded to the Microsoft Dragon Copilot cloud, the local audio file is deleted from the device.
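The stream-with-encrypted-fallback behavior described above can be sketched as follows. All class and method names here are hypothetical; this illustrates the pattern only, not Microsoft's implementation:

```python
# Illustration of the stream-with-encrypted-fallback pattern described above.
# All names are hypothetical stand-ins; the real client is not public.

class CloudTransport:
    """Stand-in for the secure streaming channel to the cloud service."""

    def __init__(self):
        self.received = b""
        self.online = True

    def stream(self, chunk: bytes) -> None:
        if not self.online:
            raise ConnectionError("no secure connection available")
        self.received += chunk


class LocalEncryptedStore:
    """Stand-in for the encrypted on-device buffer used while offline."""

    def __init__(self):
        self._buffer = b""  # a real device would encrypt this at rest

    def append(self, chunk: bytes) -> None:
        self._buffer += chunk

    def drain(self) -> bytes:
        # Hand back the buffered audio and delete the local copy, mirroring
        # the "local audio file is deleted from the device" step.
        data, self._buffer = self._buffer, b""
        return data


def send_chunk(transport: CloudTransport, store: LocalEncryptedStore, chunk: bytes) -> None:
    try:
        transport.stream(chunk)  # preferred path: live streaming
    except ConnectionError:
        store.append(chunk)      # fallback: encrypted local storage


def flush_when_connected(transport: CloudTransport, store: LocalEncryptedStore) -> None:
    transport.stream(store.drain())  # upload buffered audio once reconnected
```

The key property of the pattern is that audio is never dropped and never lingers on the device: it is either streamed live or buffered encrypted and purged after upload.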

As the encounter audio is streamed, or when an encounter audio file is uploaded, a transcript of the conversation is generated. This transcript is diarized in supported languages. Microsoft Dragon Copilot uses this transcript to generate a draft medical note containing the appropriate clinical information. The generated draft medical note is returned to the clinician for review through the Microsoft Dragon Copilot mobile, web, and desktop applications, and through direct delivery into the target electronic health record when supported. As the clinician makes edits to the content, the changes are tracked and used to further fine-tune the AI models.

Customer Data is securely transmitted to a designated 'Dragon Copilot product improvement services and data storage' location, where it is processed and de-identified or pseudonymized for product improvement and quality assurance purposes.

Customer Data retention

Customer Data processed by Microsoft Dragon Copilot is retained in accordance with the Microsoft Product and Services Data Processing Addendum (DPA) and in compliance with applicable laws and regulations.

Microsoft Dragon Copilot retains most Customer Data, including audio data captured during a patient encounter, the generated transcript, and any generative AI content, for a maximum of 90 days from the day the data was captured or generated. Such data may be de-identified or pseudonymized and retained for AI improvement purposes for as long as a customer has an active Microsoft Dragon Copilot agreement.

Customer Data that includes voice dictation audio and dictation text is stored for 180 days from the day the data was captured to enable user-specific adaptation: the system continuously learns and adjusts to a user's unique voice patterns, accent, and environmental variations, improving its accuracy in transcribing that user's speech over time.

In addition, a 1% sample of audio is retained for one year for auditing and model improvement purposes.
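Taken together, the retention rules above amount to a simple per-data-type schedule. The sketch below restates them in code; the day counts come from this whitepaper, while the structure and names are purely illustrative:

```python
from datetime import date, timedelta

# Retention windows stated in this whitepaper, in days per data type.
RETENTION_DAYS = {
    "encounter_audio": 90,        # audio captured during a patient encounter
    "transcript": 90,             # generated transcript
    "generative_ai_content": 90,  # draft notes and other generated content
    "dictation_audio": 180,       # voice dictation, kept for user adaptation
    "dictation_text": 180,
    "audit_audio_sample": 365,    # 1% audio sample kept for auditing
}


def purge_due(data_type: str, captured_on: date, today: date) -> bool:
    """True once the retention window for this data type has elapsed."""
    return today > captured_on + timedelta(days=RETENTION_DAYS[data_type])
```

For example, encounter audio captured on January 1 is purge-eligible in early April, while dictation audio from the same day remains within its 180-day window until the end of June.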

Location of Customer Data at rest

Dragon Copilot retains data for two key purposes: operations and product improvement. During the Customer Data retention period, Customer Data resides at rest in the Dragon Copilot Azure-hosted locations to support end-user services, quality assurance, and product improvement. Additional Azure AI services process the Customer Data transactionally, without long-term storage.

| Customer Location | Dragon Copilot Hosting Locations | Azure AI Processing |
| --- | --- | --- |
| US | Operations: US; Product improvement: US | US |
| Canada | Operations: Canada; Product improvement: US | US |
| UK | Operations: UK; Product improvement: EU | Sweden, Netherlands, Ireland, France |
| France, Ireland, Belgium | Operations: France; Product improvement: EU | Sweden, Netherlands, Ireland, France |
| Germany, Austria, Netherlands | Operations: Germany; Product improvement: EU | Sweden, Netherlands, Ireland, France |

Note

All Azure data centers listed for France are HDS compliant.

De-identification/pseudonymization

We de-identify or pseudonymize Customer Data to protect customer privacy during data use and storage. Using de-identified or pseudonymized data after the product data retention periods allows for AI model training while meeting data deletion obligations. When a customer contract expires, the Customer Data associated with Dragon Copilot is deleted.

The Dragon Copilot de-identification/pseudonymization process is a semi-automated process that de-identifies unstructured text, as well as structured metadata, to a standard of unlinked pseudonymized data. Unstructured text de-identification applies to recording transcripts and clinical notes.

Customer Data is de-identified in accordance with HIPAA standards (De-identified Data), pseudonymized (in an unlinked manner) in accordance with GDPR standards (Pseudonymized Data), or has personal identifiers otherwise removed to meet privacy requirements set out by PIPEDA.

This process works as follows:

  1. A proprietary protected health information ("PHI") detection engine processes unstructured text documents and detects and tags the HIPAA Safe Harbor identifiers in the documents.

  2. A surrogate generation engine substitutes the tagged identifiers with random identifiers of the same category.
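As a toy illustration of this two-step detect-and-surrogate flow (the real PHI detection engine is a proprietary model, not the regex patterns used here, and the surrogate lists below are invented):

```python
import random
import re

# Step 1 (toy): detect and tag identifiers by category.
PATTERNS = {
    "NAME": re.compile(r"\b(?:John Smith|Jane Doe)\b"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

# Step 2 (toy): surrogate values drawn from the same category, so the
# document keeps a realistic shape for downstream model training.
SURROGATES = {
    "NAME": ["Alex Morgan", "Sam Lee", "Jordan Riley"],
    "DATE": ["2001-01-01", "1999-12-31"],
}


def deidentify(text: str, rng: random.Random) -> str:
    """Replace each tagged identifier with a random same-category surrogate."""
    for category, pattern in PATTERNS.items():
        text = pattern.sub(lambda m: rng.choice(SURROGATES[category]), text)
    return text
```

Substituting same-category surrogates, rather than deleting the spans, preserves sentence structure so that de-identified text remains useful for training.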

Every quarter, a sample-based statistical quality control process ensures the accuracy of the PHI detection engine by comparing its output to human PHI labeling by Microsoft Clinical Language Annotators. The process is implemented according to the expert determination provision of the HIPAA Privacy Rule. Our de-identification algorithms have been reviewed by an independent third party to validate that they adhere to HIPAA requirements.

Any audio data that is de-identified will be held to the same standard described above except that data will be redacted instead of surrogated.

Destruction of data

Customer Data is purged using Microsoft Purview Data Lifecycle Management. Policies are configured to purge data in accordance with our retention periods or upon the termination of a customer agreement.

As soon as reasonably practical following the termination of a customer's Dragon Copilot subscription, Microsoft deletes all personal data, except to the extent applicable law requires Microsoft to continue to store it. A customer can request that Microsoft provide evidence of the deletion by logging a ticket with customer service.

Dragon Copilot is not a system of record; it only produces draft clinical documentation, and a medical care professional must use their professional judgment to edit the draft documentation before taking any action based on it in a patient care setting. Physicians must review the draft clinical documentation before it is signed and stored in the organization's EHR. Dragon Copilot is not to be used as a system of record or an archival repository for official medical records. It is the responsibility of the customer to maintain the official medical records in accordance with legal requirements for retention of such data.

Generative AI

Dragon Copilot is powered by Microsoft's conversational and ambient AI capabilities and enhanced by large language models and generative AI technology within the Microsoft Cloud for Healthcare. Microsoft's work involving Azure OpenAI technology is conducted through a Microsoft Azure subscription, which enables us to use versions of large language models (LLMs) that are within Microsoft's control and that adhere to Microsoft's data standards and practices, consistent with our existing Azure subscription. Microsoft's use of GPT does not affect or change Dragon Copilot's data use protocols as outlined in this whitepaper.

Microsoft Dragon Copilot does not utilize any OpenAI endpoints. All of Microsoft's Dragon Copilot work related to GPT, and any updates to Microsoft Dragon Copilot-specific GPT technology, are conducted within Microsoft's Dragon Copilot controlled environment and will not have any interaction with other versions of GPT, either in Azure or externally with OpenAI.

Dragon Copilot Customer Data is fully under Microsoft's control, consistent with our existing contractual terms with customers.

Data flow diagram

The diagram below describes the external data flows and data residency locations within the Microsoft Dragon Copilot application and relevant Azure services.

All communication between client applications and the SaaS service is via HTTPS using TLS 1.3 with AES-256 encryption and 2048-bit keys unless otherwise specified. Transparent Data Encryption (TDE) with 256-bit AES is used for data at rest.
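For reference, a client can enforce the same transport floor with Python's standard library. This only illustrates requiring TLS 1.3 on the client side and is not part of the product:

```python
import ssl

# Build a client-side context that refuses anything below TLS 1.3.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3

# Certificate validation stays on (the default for create_default_context).
assert ctx.verify_mode == ssl.CERT_REQUIRED
```

Wrapping a socket with this context fails the handshake against any endpoint that cannot negotiate TLS 1.3.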

The inflow and outflow of data between the 'Dragon Copilot services & operational data store' and 'EHR' are optional and only used in the embedded workflow.

Data flow diagram of Dragon Copilot and relevant Azure services for the US and Canada

Data types

Microsoft Dragon Copilot receives patient health data and transforms it into physician-facing content.

Patient data can be received via audio recordings or medical documentation edited by the clinician. If Microsoft Dragon Copilot is integrated with the EHR, the EHR may also furnish patient data, such as FHIR resources for patient, practitioner, and encounter, so that Dragon Copilot can accurately identify the patient and improve the quality of the resulting draft clinical document through a better understanding of the encounter context.

Some data, like audio recordings, is processed and transformed into additional data, including transcriptions and draft notes. This derived data may itself be used to generate further data; examples include medication orders and generative answers to user questions, created using the transcript and draft note.

Use and limitations

Dragon Copilot AI is trained to document medically relevant information extracted from audio recordings. It is not trained to extrapolate or infer information beyond what is stated. The draft documentation created by Microsoft Dragon Copilot does not evaluate or treat the patient. The clinician controls all medical decisions related to diagnosis, medical testing, and treatment plans. Microsoft Dragon Copilot only creates draft documentation of the encounter, and a qualified, medically trained professional is required to review, edit, and finalize the documentation before signing off on it in the EHR. Ultimately, the medically trained professional is responsible for diagnosis and treatment of the patient.

Employee awareness

Privacy is not only the concern of our Privacy Team. All employees of Microsoft, regardless of their global location, are obliged to complete annual training in privacy, including specific training on the GDPR.

Microsoft has adopted a Zero Trust security model. Instead of believing everything behind the corporate firewall is safe, the Zero Trust model assumes breach and verifies each request as though it originated from an uncontrolled network. Our processes are designed and implemented following a set of principles:

| Principle | Description |
| --- | --- |
| Verify explicitly | Always authenticate and authorize based on all available data points. |
| Use least privilege access | Limit user access with Just-In-Time and Just-Enough-Access (JIT/JEA), risk-based adaptive policies, and data protection. |
| Assume breach | Minimize blast radius and segment access. Verify end-to-end encryption and use analytics to get visibility, drive threat detection, and improve defenses. |

Data Protection Impact Assessments (DPIA)

While DPIAs are primarily the responsibility of the data controller, Microsoft has a fully documented process for determining the need for a DPIA when particular high risks are identified. If, even after performance of the DPIA and use of appropriate measures for risk reduction, the risk to the rights and freedoms of data subjects remains high, the Chief Privacy Officer and the Data Protection Officer will contact the competent data protection supervisory authority for consultation. Microsoft will assist customers by providing personal data processing information relevant to DPIAs that customers may conduct themselves.

Data access

Clinicians using Dragon Copilot have their own unique accounts. Integration with customer single sign-on systems is available to help enforce user access and authentication. Within Dragon Copilot, Microsoft employee access to Customer Data is based on the minimum access necessary to carry out job responsibilities. All regional regulatory controls for strict Customer Data access are in place for AI developers. Dragon Copilot will not share any Customer Data, nor respond directly to data subject requests from patients. All events that result in access to Customer Data are logged and reviewed quarterly.

We employ separation of duties between employees who approve access and those who request access through role-based control standards, where no permanent privileged access is allowed. Access follows a three-layer hierarchy: user accounts, roles, and security groups with associated permissions. A service request with manager approval is required for access to a system. Access to production systems, the application, and restricted information is on a need-to-know basis using standard Security Group role-based access controls and is restricted to a Secure Admin Workstation (SAW) device, a hardened image from the Microsoft security team. Access is consistent with the concepts of separation or segregation of duties and least privilege. Access is audited monthly to ensure proper access is maintained.

Dragon Copilot has technical safeguards in place to ensure Customer Data remains safe from unauthorized access or modification. Safeguards include:

  • Requiring unique usernames and complex passwords to the applications, servers, and networks

  • Automatic logoff

  • Emergency access procedures

  • Role-based standards

  • Encryption

  • Vulnerability management processes

Support employees have file view access only, which is used for troubleshooting purposes. They do not have the ability to download data to their local computers. Customer Data is viewed after signing in to support Jump Hosts, which monitor and enforce security rules. The Jump Host is used to connect to Microsoft Data Center resources with an airgap and an extra security layer between the user and the data. The Jump Host is an Azure VM running Remote Desktop Protocol (RDP) for Windows connections and Virtual Network Computing (VNC) for Linux connections. The Azure Portal (first connection) uses single sign-on (SSO), and the RDP session requires multifactor authentication (MFA).

Human review of data

Human review of data occurs when a Microsoft employee, subcontractor, or sub-processor accesses Customer Data. Three groups have managed eyes-on access to Customer Data that contains patient information.

Group one

The customer's systems integrator support personnel may require access in order to resolve customer service tickets. This systems integrator may be Microsoft's Customer Success Organization ("CSO") support personnel, if the service is provided directly by the CSO. The CSO may access customer service tickets, which contain detailed information about customer inquiries, issues, and feedback. This data may include personal information, descriptions of technical problems, and communication logs between the customer and support agents. Access to this information is necessary to analyze and resolve customer issues effectively, improve service quality, and identify trends or recurring problems that may require broader organizational attention.

Group two

Microsoft Clinical Integrity and AI Research teams engage in activities to build and improve Dragon Copilot AI models, ensuring they meet high standards of clinical accuracy and reliability. One of their primary tasks is to validate and author clinical guidelines, which involves accessing Customer Data to ensure that the AI's recommendations and outputs align with established medical practices and standards. This process includes reviewing real-world clinical scenarios and outcomes to refine the AI output targets.

The Dragon Copilot AI research team may leverage employee and subcontractor annotators to assist with data set creation used for model improvement, development, and evaluation. Annotators access data via Secure Admin Workstations and access is limited by appropriate role permissions which separate access between customer content (PHI or special category data) and end user identifiable information (EUII). Example use cases for which annotators may review data include, but are not limited to:

  • Transcription of audio: Annotators listen to the recordings and create a verbatim transcription of what has been said. This is used for the training and evaluation of our Automatic Speech Recognition (ASR) models. 

  • Scribing of audio into a clinical note: Annotators generate a clinical note based on the encounter's audio, transcript and medical specification guidelines created by the Clinical Integrity team. This is used for the training and evaluation of our Note Generation AI.

  • Evaluation of AI quality: Annotators assess the quality of an AI-generated medical note.

  • De-identification: Annotators review audio, transcripts, and notes to manually annotate what constitutes protected health information (PHI) or special category data that is considered identifiable.

  • Generation of orders: If a prescription is discussed between a doctor and patient, the annotator uses that discussion to generate an order containing the information the order needs to have (ICD-10 code, Rx code, dosage, and so on). This is used for the training and evaluation of our Order Generation AI.

Group three

Our engineering and research teams may need to access Microsoft Dragon Copilot services which contain Customer Data to ensure the reliable operation of the product. This access is crucial for diagnosing and resolving technical issues, optimizing system performance, and implementing necessary updates. Access requires Secure Admin Workstations, time-based approval, a ticket to justify the need for that access and appropriate role permissions which separate access between customer content (PHI or special category data) and EUII. This access is also geographically based to ensure compliance with local data privacy standards. 

| Microsoft Dragon Copilot Data Center | Partner or CSO | Clinical Integrity, Research, and Annotators | Engineering |
| --- | --- | --- | --- |
| US | SI: Refer to Partner; CSO: US, India | US, EU, Canada, India | US, EU, India |
| Canada | SI: Refer to Partner; CSO: US, India | US, EU, Canada, India | US, EU, India |
| UK | SI: Refer to Partner; CSO: UK, EU | EU | US, EU, India |
| France | SI: Refer to Partner; CSO: EU | EU, UK | US, EU, India |
| Germany | SI: Refer to Partner; CSO: EU | EU | EU |

Privacy incident management

Microsoft maintains an Incident Management Policy and notifies its customers of any incident leading to the accidental or unlawful destruction, loss, alteration, unauthorized disclosure of, or access to personal data within the scope of responsibility by any of its staff, sub-processors or any other third party. In the event of a personal data breach, Microsoft has established processes for the identification, mitigation and remediation of the breach to the extent the remediation is within Microsoft's control. Microsoft maintains thorough records of privacy incidents for inspection by a Supervisory Authority.

Microsoft notifies customers and regulatory authorities of data breaches as required. Customer notices are delivered no more than 72 hours from the time we declare a breach.

Sub-processors

Microsoft uses other organizations to assist in the provision of its products, for example in the hosting of services or the provision of cloud-based tools for support. All of these sub-processors are listed on the Microsoft Learn site. Microsoft does not allow any of these organizations to process personal data without a Data Processing Agreement that contractually subjects the sub-processor to the same data protection obligations that Microsoft has agreed to with our customers. All organizations contracted by Microsoft to process personal data are subject to vendor risk assessments for security and data privacy practices.

Security practices and policies for Dragon Copilot

In addition to the security practices and policies in the DPA, Dragon Copilot complies with the following control standards and frameworks:

  • SSAE 18 SOC 1 Type II

  • SSAE 18 SOC 2 Type II

Dragon Copilot also implements and maintains the security measures set forth in Appendix A of the DPA for the protection of Customer Data.