
Project proposal EECS 452 Fall 2012

Title of Project: Das Echtzeit-Audiosteganographiesystem

Team members: Roy Blankman, Adam Hug, Zhihao Liu, Paul Rigge

I. Introduction and overview of project

The psychoacoustic model of the human auditory system (HAS) has been developed heavily in recent years to support high-quality audio compression [1]. In the proposed project, a DSP device injects data into an audio signal such that, when the music is played, a mobile receiver can detect the data in the sound waves but a human listener cannot.

Audio steganography has many applications. There are clandestine applications that would allow spies to communicate secretly, but there are also more mainstream uses of this technology. The technique can be used to embed metadata in audio, for example to add title, artist, and duration information to FM radio or to encode weather or traffic information. Audio steganography can also be used to embed a unique token into music (also called watermarking), which can support a Digital Rights Management (DRM) scheme that would help limit piracy.

The final prototype that we will demonstrate at the Design Exposition will be a fully functional system for hidden communication. Both the transmitter and the receiver will consist of a TI DSK DSP with appropriate peripherals. Specifically, the transmitter will use a TRS port as an audio input, a USB port for digital input, and a speaker as output, while the receiver will use a microphone as input and write all decoded data to a text file.

Our data-hiding algorithms will be based on those presented in [1]. We will primarily use frequency masking to hide our data. Our channel and source coding schemes will be based on those described in [4]. We will probably use some type of Lempel-Ziv compression and a convolutional code.


II. Description of project

i. The goal of our project is to create two devices, an encoder and a decoder. The encoder will sample an audio source from an iPod or computer, hide data imperceptibly within those samples, and play the modified audio signal through a speaker. The decoder samples the modified audio and extracts the hidden message, using an error-correcting code to fix any bits altered by the channel. For low data rates, this is certainly feasible: a naïve but functional MATLAB prototype has already been written. Some of the more advanced techniques for improving bandwidth or robustness may be difficult to implement within the constraints of the DSP, but the core idea is definitely achievable.

ii. The system will consist of three major components: a transmitter, a channel, and a receiver. The transmitter and receiver will each consist of a DSP, and the channel will be a speaker and a microphone connected to the transmitter and receiver, respectively. The transmitter has two inputs: audio and data. Its output is an audio stream that contains both. When this stream is played through a speaker, a human will hear only the input audio, but the receiver will be able to recover the input data embedded in the transmitter’s output. Ideally, the receiver analyzes the audio stream and extracts the same data that the transmitter originally embedded.

Figure 1. Audio Steganography System

The input data is compressed and then encoded using an error-correcting code (perhaps a convolutional code). The transmitter applies a windowed DFT to the input audio and analyzes the spectrum of fixed-width time intervals. The data-hiding block analyzes this spectrum to find suitable places to hide binary data; these are referred to as masked frequencies. It then encodes the data by modifying the spectrum such that the change is imperceptible to the human auditory system. Once this is complete, an inverse DFT is taken and the modified audio stream is sent to the channel.

 

Figure 2. Transmitter Design
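As a rough illustration of the transmitter steps described above (frame the audio, take a DFT, modify a bin near a strong component, take the inverse DFT), the following Python sketch hides one bit per frame. It is a toy scheme written for this illustration, not the algorithm from [1] and not our MATLAB prototype. The frame length, the bin offset, the two magnitude ratios, and the names `embed`, `extract`, and `find_masker` are all assumptions, and no real psychoacoustic masking threshold is computed. It also assumes a rectangular window, non-overlapping frames, perfect synchronization, and a noiseless channel.

```python
import numpy as np

FRAME = 512              # samples per frame (assumed value)
OFFSET = 3               # carrier bin sits this many bins above the masker (assumed)
HIGH, LOW = 0.10, 0.02   # carrier/masker magnitude ratio for bit 1 / bit 0 (assumed)


def find_masker(spec):
    """Return the index of the strongest bin, skipping DC and leaving room for the carrier."""
    mags = np.abs(spec[1:len(spec) - OFFSET - 1])
    return 1 + int(np.argmax(mags))


def embed(audio, bits):
    """Hide one bit per frame by setting the carrier bin to a fixed fraction of the masker."""
    if len(audio) < len(bits) * FRAME:
        raise ValueError("audio too short for the requested number of bits")
    out = np.asarray(audio, dtype=float).copy()
    for i, bit in enumerate(bits):
        start = i * FRAME
        spec = np.fft.rfft(out[start:start + FRAME])
        k = find_masker(spec)
        ratio = HIGH if bit else LOW
        phase = np.angle(spec[k + OFFSET])          # keep the original phase
        spec[k + OFFSET] = ratio * np.abs(spec[k]) * np.exp(1j * phase)
        out[start:start + FRAME] = np.fft.irfft(spec, n=FRAME)
    return out


def extract(audio, n_bits):
    """Recover the bits by comparing the carrier/masker ratio against the midpoint threshold."""
    threshold = (HIGH + LOW) / 2
    bits = []
    for i in range(n_bits):
        start = i * FRAME
        spec = np.fft.rfft(np.asarray(audio[start:start + FRAME], dtype=float))
        k = find_masker(spec)
        ratio = np.abs(spec[k + OFFSET]) / (np.abs(spec[k]) + 1e-12)
        bits.append(1 if ratio > threshold else 0)
    return bits
```

Because the carrier is always kept well below the masker, the strongest bin is unchanged by embedding, so the receiver can locate the same bin without side information. Over a real speaker and microphone the ratios would be disturbed, which is why an error-correcting code follows this stage.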

The receiver applies a windowed DFT to the audio from the channel. A detector identifies masked frequencies and looks for evidence that the transmitter has modified the spectrum. The detector makes its best guess as to what data has been hidden, and the Viterbi algorithm is then used to correct errors. The data is decompressed and then displayed on an LCD screen.

Figure 3. Receiver Design
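To make the error-correction stage concrete, here is a minimal Python sketch of a convolutional encoder and a hard-decision Viterbi decoder. The proposal does not fix a particular code, so the rate-1/2, constraint-length-3 code with generator polynomials (7, 5) in octal is an assumption chosen only because it is small; the function names are likewise hypothetical.

```python
G = (0b111, 0b101)   # generator polynomials (7, 5) octal; an assumed, textbook-sized code
N_STATES = 4         # 2 memory bits -> 4 encoder states


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def conv_encode(bits):
    """Rate-1/2 convolutional encoder; two tail zeros return the encoder to state 0."""
    state = 0
    out = []
    for b in list(bits) + [0, 0]:
        reg = (b << 2) | state            # newest bit in the top position
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1                  # shift the oldest bit out
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding using Hamming distance as the branch metric."""
    inf = float("inf")
    metric = [0] + [inf] * (N_STATES - 1)     # the encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    for t in range(len(received) // 2):
        r = received[2 * t:2 * t + 2]
        new_metric = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for state in range(N_STATES):
            if metric[state] == inf:
                continue                      # state not reachable yet
            for b in (0, 1):
                reg = (b << 2) | state
                expected = [parity(reg & g) for g in G]
                cost = metric[state] + sum(e != x for e, x in zip(expected, r))
                nxt = reg >> 1
                if cost < new_metric[nxt]:
                    new_metric[nxt] = cost
                    new_paths[nxt] = paths[state] + [b]
        metric, paths = new_metric, new_paths
    return paths[0][:-2]                      # survivor ending in state 0, tail removed
```

This code has free distance 5, so the decoder recovers the message when at most two of the received bits are flipped. A real receiver would more likely pass soft reliability values from the detector instead of hard bits.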

However, our team is considering alternative data-hiding schemes. Although frequency masking seems easy to implement, we are concerned that it will not achieve a high enough bitrate. According to the existing literature, higher-bandwidth communication can be achieved using spread-spectrum audio steganography. Because the mathematics behind this scheme is complex, it might take us too long to write functional software implementing it.

Finally, if both of the above methods take too long to implement, a last scheme we could use is Echo Hiding [2]. This is a very simple scheme that adds scaled, delayed copies of a signal to itself, where a delay of time t_1 encodes a 1 and a delay of time t_0 encodes a 0. This scheme will almost certainly work, but it will unfortunately create audible distortion in the audio signal; it would therefore be non-ideal steganography.

iii. Our team foresees possible complications:

a) The DSP chips turn out to be too slow to process audio data in real time.
Solution: Our project consists mainly of software components (i.e., MATLAB/C code), so we could run these tasks on computers with more powerful processors. Although this would mean our project would not be running on proper DSP chips, we would still be demonstrating some interesting DSP applications.

b) The speakers and microphones are too noisy to allow reliable communication.
Solution: Ask for an increased budget in order to buy higher-quality speakers and microphones.

c) Synchronization is more difficult than expected, which causes unreliable communication.
Solution: We can periodically add barely audible tones to make synchronization easier for our algorithms.

d) The maximum bitrate of the communications system is too low.
Solution: Use a more complex data-hiding scheme that allows for increased bandwidth, perhaps by finding multiple masking frequencies per time interval or by using spread-spectrum techniques.
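For comparison with these options, the Echo Hiding fallback described earlier in this section is simple enough to sketch in a few lines of Python. The segment length, the two delays, the echo gain, and the function names are assumptions made for this illustration. Detecting the delay with the real cepstrum is one common choice and is also an assumption here, since the proposal does not specify a detector. The gain is set deliberately high so the sketch decodes reliably, which also makes the echo clearly audible.

```python
import numpy as np

SEG = 4096        # samples per hidden bit (assumed)
D0, D1 = 50, 80   # echo delay in samples for bit 0 / bit 1 (assumed)
ALPHA = 0.5       # echo gain; large for robustness, so clearly audible (assumed)


def echo_embed(audio, bits):
    """Add a scaled, delayed copy of each segment to itself; the delay encodes the bit."""
    audio = np.asarray(audio, dtype=float)
    if len(audio) < len(bits) * SEG:
        raise ValueError("audio too short for the requested number of bits")
    out = audio.copy()
    for i, bit in enumerate(bits):
        start = i * SEG
        seg = audio[start:start + SEG]
        d = D1 if bit else D0
        out[start + d:start + SEG] += ALPHA * seg[:-d]   # echo stays inside the segment
    return out


def echo_extract(audio, n_bits):
    """Compare the real cepstrum at the two candidate delays and pick the larger peak."""
    audio = np.asarray(audio, dtype=float)
    bits = []
    for i in range(n_bits):
        seg = audio[i * SEG:(i + 1) * SEG]
        log_mag = np.log(np.abs(np.fft.fft(seg)) + 1e-12)
        cepstrum = np.fft.ifft(log_mag).real
        bits.append(1 if cepstrum[D1] > cepstrum[D0] else 0)
    return bits
```

At one bit per 4096 samples the data rate is very low, which is consistent with treating this scheme as a last resort.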

   


iv. Our preliminary parts list consists of the following:

1) TI C5515 eZDSP
We will need two of these. The reference documents can be found on the course website [5].

2) NADY SP-4C Dynamic Microphone
We need a microphone to capture the audio signal at the receiver. There are no specific requirements for the microphone, so we can simply purchase the one listed at [6]. The total cost for this item will be $15.99.

3) Logitech LS21 2.1 Stereo Speaker System
We need a speaker to play the transmitter’s modified audio into the channel. We can purchase one at [7]. The total cost for this item will be $25.69.

III. Milestones

a. Milestone 1: Finish programming data hiding, data recovery and coding software. This milestone consists of devising and implementing algorithms for source coding (compression), channel coding (assuming our channel will behave as a binary symmetric channel (BSC) or binary erasure channel (BEC)), data hiding, data recovery, and data decoding. Everything must first be written in MATLAB and then ported to C for use on the DSPs. This milestone will rely on our team splitting up the programming effectively so that all the code can be finished on time. We aim to finish the MATLAB coding by the time school resumes after spring break. It should not take more than a week to port the algorithms to C.

b. Milestone 2: Integrate software with hardware and demonstrate reliable data transfer. All the C code written for milestone 1 will be placed on our two DSPs, and any functions such as the DFT (which have optimized implementations for the DSP) must be adjusted to work properly with the hardware. Next, we will have to adjust the code to use microphones, speakers and a serial line as inputs instead of stored digital data. Finally, we will test the system in the EECS 452 lab to ensure that it is a viable real-time communications system. We aim to finish all the integration before the Design Exposition.

c. A potential issue for milestone 1 is that devising a data-hiding algorithm may be more mathematically complex than we expect, so developing it could take longer than our schedule allows. A potential issue for milestone 2 is that our chosen hardware, especially the speakers and microphones, may introduce excessive noise that precludes reliable communication. Additionally, our algorithm implementation may be too complex to run in real time on our hardware.
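Milestone 1 assumes that the channel behaves as a BSC or a BEC. As a point of reference for choosing a code rate, the short Python sketch below evaluates the standard capacity formulas for both channels (1 - H2(p) for a BSC with crossover probability p, and 1 - e for a BEC with erasure probability e) and simulates a BSC for testing. The function names are hypothetical, and whether our acoustic channel is well modelled by either channel is still an open assumption.

```python
import math
import random


def binary_entropy(p):
    """Binary entropy H2(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(p):
    """Capacity in bits per channel use of a BSC with crossover probability p."""
    return 1.0 - binary_entropy(p)


def bec_capacity(e):
    """Capacity in bits per channel use of a BEC with erasure probability e."""
    return 1.0 - e


def bsc(bits, p, rng):
    """Flip each bit independently with probability p."""
    return [b ^ int(rng.random() < p) for b in bits]
```

For example, `bsc_capacity(0.11)` is roughly 0.5, so a rate-1/2 code could not be reliable on a BSC that flips much more than about 11% of the bits.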

IV. Contributions of each member of team

Audio steganography can be implemented in many ways. At the start of the project, Adam and Zhihao will experiment with an Echo Hiding scheme while Roy and Paul experiment with a frequency masking scheme. Both implementations will initially be coded in MATLAB to see whether one or both are feasible. Benchmarks that measure feasibility will include bit error rate, bit transfer rate, and how perceptible the modification is to a human observer.

If both turn out to be feasible, it may be possible to combine the two schemes to minimize the bit error rate. Otherwise, the more robust implementation will be chosen.
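The first two benchmarks are straightforward to compute; a minimal Python sketch is given below with hypothetical function names. The proposal does not say how perceptibility will be measured, so the signal-to-noise ratio between the original and modified audio is included only as a crude stand-in, on the assumption that listening tests would make the real judgment.

```python
import math


def bit_error_rate(sent, received):
    """Fraction of bit positions that differ between two equal-length bit sequences."""
    if len(sent) != len(received):
        raise ValueError("sequences must have the same length")
    return sum(s != r for s, r in zip(sent, received)) / len(sent)


def bit_rate(n_bits, n_samples, sample_rate_hz):
    """Hidden-data throughput in bits per second of audio."""
    return n_bits * sample_rate_hz / n_samples


def snr_db(original, modified):
    """Ratio of original signal power to modification power, in dB."""
    signal = sum(x * x for x in original)
    noise = sum((y - x) ** 2 for x, y in zip(original, modified))
    return 10 * math.log10(signal / noise)
```

As an example, hiding one bit in every 512-sample frame at an assumed 48 kHz sampling rate gives `bit_rate(1, 512, 48000)`, which is 93.75 bits per second.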

Roy and Paul have a strong background in communications. As such, they will primarily work on the data decoding process. They will port the decoding algorithms, which extract the information hidden in the audio, from MATLAB to C. When this is finished, they will modify their code so that it runs on the C5515 DSP chip. All decoding will be done in real time.

Zhihao and Adam both have a strong background in DSP. They will work on the data encoding process. This involves porting the encoding algorithms, which take in an audio signal and add inaudible modifications, from MATLAB to C. The encoder will run on the C5515 DSP chip, and all encoding will be done in real time.

 


V. Logistics

a. Meeting schedule
i. Our team will meet every Wednesday afternoon to lay out weekly goals.
ii. Each team member will be expected to send out an email on Friday and Monday detailing their progress in our project thread.

b. Version control system
Since our project requires a lot of coding, it is convenient to set up a version control system so that each of us can read and write our code easily. We have therefore set up a Git repository stored in a private AFS space.

c. Planned demonstration for design expo
Currently, we plan to have a working communications system to display at the design expo. The system would consist of an audio source piped through a DSP that embeds data in the waveform in real time; the result is then played over a loudspeaker. A second DSP would take input from a microphone and analyze the audio it receives in order to recover the hidden signal. Observers will be allowed to enter data by typing on a keyboard and see their input transmitted inaudibly over our communications system; the final message will be printed to an LCD screen attached to the receiver’s DSP.


 

VI. References and citations

[1] Nedeljko Cvejic. Algorithms for Audio Watermarking and Steganography.
[2] Daniel Gruhl, Walter Bender. Echo Hiding. Elsevier.
[3] Stéphane Mallat. A Wavelet Tour of Signal Processing.
[4] David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.
[5] http://www.eecs.umich.edu/courses/eecs452/refs.html
[6] http://www.amazon.com/Nady-SP-4C-NADY-Dynamic-Microphone/dp/B00009W40D/ref=sr_1_5?s=musical-instruments&ie=UTF8&qid=1328468297&sr=1-5
[7] http://www.amazon.com/Logitech-LS21-Stereo-Speaker-System/dp/B0015C30J0/ref=sr_1_12?s=aht&ie=UTF8&qid=1328468465&sr=1-12