26
Pr ch: A System for Privacy- Preserving Speech Transcription Shimaa Ahmed, Amrita Roy Chowdhury, Kassem Fawaz, and Parmesh Ramanathan USENIX Security Symposium 2020 1

Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Pr𝜀𝜀ch: A System for Privacy-Preserving Speech Transcription

Shimaa Ahmed, Amrita Roy Chowdhury, Kassem Fawaz, and Parmesh Ramanathan

USENIX Security Symposium 2020

1

Page 2: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Speech Transcription*

Speech transcription applications:

2

• Scalable

• Privacy-preserving

• Accurate

* Transcription by IBM Speech-to-Text Demo

Page 3: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Speech Transcription Services

Cloud-based transcription

Deep Speech

+by

3

Open-source offline transcription

Page 4: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Performance Comparison

Standard datasets

4

0.00%

5.00%

10.00%

15.00%

20.00%

25.00%

LibriSpeech DAPS TIMIT

WER* (%)

GoogleAWSDeep Speech

* Word-Error-Rate (WER) = "#$#%&𝐷:𝐼:𝑆:𝑁:

# Deletion# Insertion# Substitution# Reference words

Page 5: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Performance Comparison

Real world use-cases• Facebook hearing before the US Senate

• Supreme Court case “Carpenter v. United States”

• VCTK: non-American accent dataset• Speaker p266 of an Irish accent• Speaker p262 of a Scottish accent

5

Page 6: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Standard vs Real World Performance

0.00%5.00%

10.00%15.00%20.00%25.00%30.00%35.00%40.00%45.00%

LibriS

peec

hDAPS

TIMIT

VCTK - p266

VCTK - p262

Faceb

ook1

Faceb

ook2

Faceb

ook3

Carpne

ter1

Carpen

ter2

WER (%)

GoogleAWSDeepSpeech

6

Standard Real World

Off-the-shelf offline transcribers are not reliable for real world applications

Page 7: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Speech is a Rich Source of Sensitive Information

Voice biometrics

• Personal attributes

• Identity

• Impersonation

7technology can clone a speaker’s voice from a short segment of their speech

Page 8: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Textual content

• Sensitive words

• Statistical analysis of the entire transcript• Topic model• Stylometry analysis• Document classification• Sentiment analysis

8

Speech is a Rich Source of Sensitive Information

Page 9: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Utility-Privacy Trade-off

9

Utility

Cloud service providers

Offline service providers

Privacy

Cloud service providers

Offline service providers

Goal: Design an end-to-end transcription system that provides an intermediate solution along the utility-privacy spectrum

Page 10: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

• Obfuscates the users’ voice biometrics

• Protects the sensitive textual content

• Improves on the transcription accuracy compared to offline systems

• Provides control knobs to customize its utility and privacy levels

Original SpeechPrivacy-Preserving Operations

High-accuracy Transcription Final Transcript

De-noising

10

Pr𝜀𝜀ch: Privacy-Preserving Speech Transcription

Page 11: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

• Obfuscates the users’ voice biometrics

• Protects the sensitive textual content

• Improves on the transcription accuracy compared to offline systems

• Provides control knobs to customize its utility and privacy levels

Original SpeechPrivacy-Preserving Operations

High-accuracy Transcription Final Transcript

De-noising

11

Pr𝜀𝜀ch: Privacy-Preserving Speech Transcription

Page 12: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Voice Biometrics

Many-to-One Voice Conversion

Voice Conversion

12

Senator Harris

Senator Thune

Mark Zuckerberg

0% accuracy in matching original speakers with their voice-converted speech using Azure speaker identification API

Page 13: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

• Obfuscates the users’ voice biometrics

• Protects the sensitive textual content

• Improves on the transcription accuracy compared to offline systems

• Provides control knobs to customize its utility and privacy levels

Original SpeechPrivacy-Preserving Operations

High-accuracy Transcription Final Transcript

De-noising

13

Pr𝜀𝜀ch: Privacy-Preserving Speech Transcription

Page 14: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Break the context

• Segmentation

• Sensitive words scrubbing Original speech Segmentation

~3 non-stop words

Sensitive WordsScrubbing

14

Deep Speechtranscription

information about 87million Facebook users being obtained by the company CambridgeAnalytica

Page 15: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Break the context

• Segmentation

• Sensitive words scrubbing Original speech Segmentation

~3 non-stop words

Sensitive WordsScrubbing

15

Deep Speechtranscription

PocketSphinx

information about 87million Facebook users being obtained by the company CambridgeAnalytica

Page 16: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Break the context

• Segmentation

• Sensitive words scrubbing

• The textual content is transferred into a bag-of-words model

Original speech Segmentation

~3 non-stop words

Sensitive WordsScrubbing

16

Deep Speechtranscription

information about 87million Facebook users being obtained by the company CambridgeAnalytica

Page 17: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Differentially Private (DP) Words’ Histogram

• Bag-of-words: histogram of words

• Apply DP to the true histogram• A randomized mechanism 𝐴: ℕ|"| → ℕ|"| satisfies (𝜀, 𝛿)-DP, if for any pair of

histograms 𝐻# and 𝐻$ such that ||𝐻#, 𝐻$||# = 𝑑, and for any set 𝑂 ⊆ ℕ|"|,

Words

Coun

t

17

Pr 𝐴(𝐻# ∈ 𝑂] ≤ 𝑒% Pr 𝐴(𝐻$ ∈ 𝑂] + 𝛿

Page 18: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

DP Challenges

• Pr𝜀𝜀ch has access only to the speech, but not the transcript• No access to the true histogram

• The noise ‘dummy words’ must be added in the speech domain

• The dummy words must be indistinguishable from the true speech• Segment length• Voice• Language model

18

Page 19: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Dummy Segments Generation

The Lanham Act's banon federal registrationof scandaloustrademarks is not arestriction on speechbut a valid conditionon participation in afederal program.

Dummy Text Generation TTS

Domain

Legal, courtcase, violation

Original Speech

Offline transcriber

19

ban on federal registration

DP Mechanism

trademarks is not a restriction

in a federal program

Page 20: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

End-to-End System(1) Hide voice biometrics (2) Ensure noise indistinguishability

Protect the textual content

20

The Lanham Act's banon federal registration ofscandalous trademarksis not a restriction onspeech but a validcondition onparticipation in a federalprogram.

At issue in thiscase is thegovernment'swarrantlesscollection of 127days ofPetitioner's cellsite locationinformation

Original speech

DomainLegal, courtcase, supremecourt

1. Segmentation

3. Dummy Text Generation 4. TTS

5. Voice Conversion

7. Transcription

8. De-noising and de-shuffling

6. Partitioning,adding noise, and shuffling.

government's At on scandalous

cell issue of location7. Transcription

2. Sensitive wordscrubbing

Final Transcript

In Pr𝜀𝜀ch, the DP noise does NOT deteriorate the utility, instead it adds monetary cost overhead

speech not a registration

in condition a federal restriction

valid Petitioner's 127 Lanham on collection

Participation in this program

Page 21: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

• Obfuscates the users’ voice biometrics

• Protects the sensitive textual content

• Improves on the transcription accuracy compared to offline systems

• Provides the users with control knobs to customize its utility and privacy levels

Original SpeechPrivacy-Preserving Operations

High-accuracy Transcription Final Transcript

De-noising

21

Pr𝜀𝜀ch: Privacy-Preserving Speech Transcription

Page 22: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

End-to-End System

22

The Lanham Act's banon federal registration ofscandalous trademarksis not a restriction onspeech but a validcondition onparticipation in a federalprogram.

At issue in thiscase is thegovernment'swarrantlesscollection of 127days ofPetitioner's cellsite locationinformation

Original speech

DomainLegal, courtcase, supremecourt

1. Segmentation

3. Dummy Text Generation 4. TTS

5. Voice Conversion

7. Transcription

8. De-noising and de-shuffling

6. Partitioning,adding noise, and shuffling.

government's At on scandalous

cell issue of location7. Transcription

2. Sensitive wordscrubbing

Final Transcript

speech not a registration

in condition a federal restriction

valid Petitioner's 127 Lanham on collection

Participation in this program

Page 23: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Utility: Transcription Accuracy

Dataset No Voice Conversion

One-to-one Voice Conversion

Many-to-One Voice Conversion

Deep Speech

VCTK p266 5.15 16.55 21.92 26.72VCTK p262 4.53 7.39 10.82 15.97Facebook1 8.26 14.60 20.30 24.72Facebook2 9.75 18.27 19.44 26.61Facebook3 14.93 23.25 27.06 30.72Carpneter1 14.43 23.88 22.63 25.85Carpenter2 13.53 33.71 38.90 39.71

Table: WER(%) at different settings of Pr𝜀𝜀ch vs Deep Speech

23

2% − 32.5%44% − 80%

Page 24: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Textual Content Privacy

Noise Level

Original Word Cloud

24

Page 25: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

Formal Privacy Guarantee

Post-processing of DP:

25

For a speech file 𝑆, Pr𝜀𝜀ch provides perfect voice privacy using many-to-one voice conversion and an (𝜀, 𝛿)-DP guarantee on the word histogram for the domain considered, under the assumption that the dummy segments are indistinguishable from the true segments.

Any statistical analysis on the noisy words’ histogram does not cause loss in privacy

Page 26: Pr!!ch: A System for Privacy- Preserving Speech Transcription · 2. Sensitive word scrubbing Final Transcript In Pr,,ch, the DP noise does NOT deteriorate the utility, instead it

TakeawaysPr𝜀𝜀ch as a privacy-preserving speech transcription system:• Provides an improved performance relative to offline transcription• 2% to 32.52% relative improvement in WER

• Obfuscates the speakers’ voice biometrics• 0% accuracy in matching real speakers with their voice-converted speech

• allows only a DP view of the textual content.

26

Contact us: [email protected]