Scalar Quantization for Audio Data Coding

Boris D. Kudryashov, Anton V. Porov, and Eunmi L. Oh

Abstract: This paper is concerned with scalar quantization of transform coefficients in an audio codec. The generalized Gaussian distribution (GGD) is used as an approximation of the one-dimensional probability density function of transform coefficients obtained by a modulated lapped transform (MLT) or modified discrete cosine transform (MDCT) filterbank. The model is justified by comparison with the theoretically achievable rate-distortion function: the rate-distortion function computed for a random sequence obtained from real samples of a large database is compared with that computed for a random sequence produced by a GGD random generator. A simple algorithm for constructing the Extended Zero Zone (EZZ) quantizer is proposed. Simulation results show that the EZZ quantizer yields a negligible loss in coding efficiency compared to optimal scalar quantizers. Furthermore, we describe an adaptive version of the EZZ quantizer which works efficiently with low bitrate requirements for transmitting side information.

Index Terms: Adaptive lossy coding, audio data, generalized Gaussian distribution, non-uniform quantization, scalar quantization, uniform quantization.

I. INTRODUCTION

Most perceptual audio codecs are based on cosine-modulated filterbanks. Previous research shows that high transform gains can be obtained from such filterbanks with reasonable implementation complexity [1]–[3]. Typically, the psychoacoustic module of a perceptual audio codec is used to optimize the allocation of bits across the subbands of an audio signal. The psychoacoustic module takes into account the sensitivity of the human ear to distortions of the original sound, which depends on frequency range, amplitude, neighboring sounds (in the frequency or time domain), and so on.
For each frame of N time-domain samples, a critically sampled filterbank outputs N frequency-domain spectrum coefficients, which are then quantized and losslessly encoded. The quantization module is one of the most important modules of the audio codec. The quantizer receives the transform coefficients from the filterbank and the required quantization precision from the psychoacoustic module (PAM; see Fig. 1).

Manuscript received March 26, 2008. B. D. Kudryashov and A. V. Porov are with the St.-Petersburg State University of Information Technology, Mechanics and Optics, 197101, St.-Petersburg, Russia (e-mail: boris@eit.lth.se, PorovAnton@yandex.ru). Eunmi Oh is with the Samsung Advanced Institute of Technology, Suwon, 440-600, Korea (e-mail: sait@samsung.com).

The output of the quantizer is further processed by a lossless entropy encoder. For example, a properly designed arithmetic encoder can provide a coding rate equal to the entropy of the quantized data. Therefore, the entropy of the quantized data can be used as the criterion for quantization analysis. The goal of this paper is to develop a quantizer that produces output data with minimum entropy under a given restriction on the quantization precision.

In general, there are two competing classes of quantizers: scalar and vector quantizers [4], [5]. Within each class we can also choose among different types of quantizers. The optimal choice, of course, depends on the range of bit rates and on the source model. However, there are some important practical limitations which make scalar quantization more favorable. The first is coding and reconstruction complexity. The coding gain of vector quantization over scalar quantization grows slowly with the quantization dimension, whereas memory consumption and computational complexity grow exponentially [4], [5].
One more argument in favor of scalar quantization is related to the adaptation of a quantizer to dynamic changes in the probability distribution of the source data. Typically, the codebook of a vector quantizer is constructed from some training set of audio signals. Such a codebook might be inefficient for a particular audio signal, because the input can be quite different from the data set used to construct the codebook. This problem is less likely to arise for a scalar quantizer, since each component is processed separately.

Yet another argument (probably the most important one for low-rate audio codecs) is more sophisticated. It takes into account specific properties of generalized Gaussian distribution (GGD) random variables. We will show in Section III that the potential coding gain of vector quantization heavily depends on the bit rate and on the parameters of the GGD or, more generally, on the "tails" of the distribution. The well-known estimate of the vector quantization gain of 1.54 dB (for the MSE as a distortion criterion) is valid under the assumption of high rates or small distortions (see [9], [10]). It appears that for distributions with heavy tails the distortions become "small" only at much higher rates than for, say, the Gaussian distribution. We will show in Section III that for a GGD with a small value of the parameter α and for coding rates below 1 bit per sample, the rate-distortion function of scalar quantization is very close to that of vector quantization, i.e., to the theoretical limit. This phenomenon makes the potential gain of vector quantization rather small, so vector quantization loses to scalar quantization in terms of the distortion/complexity tradeoff.

Fig. 1. Quantization module for audio.
We consider closeness to the theoretical limits of quantization performance as the figure of merit of a particular quantization scheme. For a stationary random process such theoretical limits are defined by the Shannon rate-distortion function. There are several obstacles to using this approach in real applications: the non-stationarity of real signals, the complexity of their mathematical models, and the absence of analytical expressions for the rate-distortion functions of most probability density functions.

Our approach to overcoming these problems is similar to that used in universal lossless source coding [13]. We split the spectral coefficients into subbands, assuming that the spectrum coefficients are stationary within each subband. For each subband, we estimate the parameters of the GGD model by the method of moments, as illustrated in Section II. Then we construct a quantizer for the GGD source using these estimates instead of the unknown true values of the parameters. In the theory of universal lossless source coding it is proven that a similar strategy provides a rate arbitrarily close to the entropy of the source [13]. The redundancy of universal coding per encoded letter is proportional to $(\log_2 n)/n$ and vanishes as the length $n$ of the sequence used for estimating the unknown parameters increases. This redundancy is interpreted as the cost of side information. Examples of side information are the scale index, the scale factor, etc., which can depend strongly on the type of quantizer.

For GGD sources we show that a near-optimum scalar quantizer can be found among uniform quantizers or extended zero-zone (EZZ) quantizers (the efficiency of optimum scalar quantization and uniform scalar quantization for GGD variables was studied in [6]).
This means that little side information about the quantizer needs to be transmitted for a given subband of a frame of the encoded data: only the quantization step and the relative width of the zero zone. Thereby, as in universal data compression, the cost of side information is relatively small, even for rather short quantized sequences.

The rest of the paper is organized as follows. In Section II the source model is considered. In Section III we consider types of scalar quantizers and study their performance. Section IV is devoted to EZZ quantizers. Adaptation of the quantizer parameters to a changing input data distribution is studied in Section V.

II. SOURCE MODEL

We consider the sequence of spectrum coefficients of each separate spectrum subband as a stationary sequence of independent identically distributed random variables. Thus the source model is fully described by a one-dimensional probability density function. In multimedia applications such as video and audio data coding, the generalized Gaussian distribution (GGD) is often used as a source model. The corresponding probability density function has the form

$$f(x) = \frac{\alpha\,\eta(\alpha,\sigma)}{2\,\Gamma(1/\alpha)} \exp\left\{-\left[\eta(\alpha,\sigma)\,|x-m|\right]^{\alpha}\right\},$$

where $m$ and $\sigma$ are the mathematical expectation and the standard deviation of the random variable, $\alpha$ is the shape parameter, $\Gamma(\cdot)$ is the Gamma function

$$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt, \quad x > 0,$$

and

$$\eta(\alpha,\sigma) = \sigma^{-1} \left[\frac{\Gamma(3/\alpha)}{\Gamma(1/\alpha)}\right]^{1/2}.$$

Plots of $f(x)$ are shown in Fig. 2. Special cases of the GGD are the Gaussian distribution ($\alpha = 2$), the Laplacian distribution ($\alpha = 1$) and the uniform distribution ($\alpha \to \infty$).
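The density above can be evaluated directly with the standard library. The following is a minimal sketch, not code from the paper: the function names `ggd_eta`, `ggd_pdf` and `integrate` are hypothetical, and the midpoint-rule integrator is only a convenience for checking that the density has unit mass and variance $\sigma^2$.

```python
import math


def ggd_eta(alpha, sigma=1.0):
    """Scale factor eta(alpha, sigma) = sqrt(Gamma(3/alpha) / Gamma(1/alpha)) / sigma."""
    return math.sqrt(math.gamma(3.0 / alpha) / math.gamma(1.0 / alpha)) / sigma


def ggd_pdf(x, alpha, sigma=1.0, m=0.0):
    """Generalized Gaussian density with mean m, standard deviation sigma, shape alpha."""
    eta = ggd_eta(alpha, sigma)
    norm = alpha * eta / (2.0 * math.gamma(1.0 / alpha))
    return norm * math.exp(-((eta * abs(x - m)) ** alpha))


def integrate(f, a, b, steps=40000):
    """Midpoint-rule integral of f over [a, b] (illustrative helper only)."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    for alpha in (1.0, 2.0):
        mass = integrate(lambda x: ggd_pdf(x, alpha), -20.0, 20.0)
        var = integrate(lambda x: x * x * ggd_pdf(x, alpha), -20.0, 20.0)
        print(alpha, mass, var)
```

For $\alpha = 2$ the function reduces to the standard normal density and for $\alpha = 1$ to the unit-variance Laplacian density. For small $\alpha$ the tails are heavy, so a numerical check of this kind would need a much wider integration range.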
The information-theoretic rate-distortion function $R(D)$ [7] for a memoryless discrete-time stationary random process is defined as

$$R(D) = \min_{f(y|x):\ E[d(x,y)] \le D} I(X;Y),$$

where $X$ and $Y$ are the source alphabet and the approximation alphabet, respectively,

$$I(X;Y) = \int_X \int_Y f(x)\, f(y|x) \log_2 \frac{f(y|x)}{f(y)}\, dx\, dy$$

is the mean mutual information between $X$ and $Y$, $f(y) = \int_X f(x) f(y|x)\, dx$ is the marginal density of the approximation, and $d(x,y)$ is a nonnegative function called the distortion measure. We consider the mean squared error (MSE)

$$d(x,y) = (x-y)^2$$

as the fidelity criterion. The rate-distortion function $R(D)$ determines the least achievable bit rate $R = R(D)$ under the restriction that the mean distortion does not exceed $D$. We use the numerical method of [8] for computing $R(D)$.

Fig. 2. Generalized Gaussian distribution.

To verify whether the GGD model is appropriate for audio coding, we performed the following experiments. We split the MDCT spectrum coefficients of audio signals into subbands according to the Bark scale, and for each subband we generated a long data sequence by applying an MDCT-based filterbank to a large bank of audio fragments. Then for each subband sequence $x_1, \ldots, x_n$ the parameters of the GGD were estimated. We assumed the mean value $m = 0$ and estimated the variance as

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2,$$

and the first absolute moment as

$$\hat E[|x|] = \frac{1}{n}\sum_{i=1}^{n} |x_i|.$$

The estimated parameter $\hat\alpha$ of the GGD is computed as the solution of the equation

$$\frac{\hat\sigma^2}{\hat E^2[|x|]} = \frac{\Gamma(1/\hat\alpha)\,\Gamma(3/\hat\alpha)}{\Gamma^2(2/\hat\alpha)}. \qquad (1)$$

The Blahut algorithm [8] was used to compute two rate-distortion functions for each subband:

1) The "theoretical" function $R_T(D)$ was computed for the discrete alphabet $\hat X$ obtained by fine quantization of $X$, with the probabilities of $\hat x \in \hat X$ found using the GGD with $\alpha$ determined as the solution of (1).
2) The "empirical" function $R_E(D)$ was computed for the discrete alphabet $\hat X$ obtained by fine quantization of $X$, with the probabilities of $\hat x \in \hat X$ estimated directly from the real sample data sequence.

One typical example of these two functions is given in Fig. 3 for the estimated GGD parameters $\alpha = 0.67$ and $\sigma^2 = 1$. It is clear from the figure that the two curves are almost indistinguishable, which confirms that the GGD is a good mathematical model for MDCT spectrum coefficients.

III. UNIFORM, OPTIMAL UNIFORM AND NON-UNIFORM SCALAR QUANTIZER

We start with upper bounds on the rate-distortion function for scalar quantization. Most estimates of quantization performance are derived for so-called "high resolution" quantization. Under the assumption that the number of quantization levels is so large that the probability density function is almost uniform within each quantization step, the following estimate of the scalar quantization rate-distortion function $R_S(D)$ was obtained by Koshelev [9] and later by Gish and Pierce [10]:

$$R_S(D) \approx R_{SH}(D) + \frac{1}{2}\log_2\frac{\pi e}{6} \approx R_{SH}(D) + 0.2546 \text{ bits}, \qquad (2)$$

where $R_{SH}(D)$ is the Shannon lower bound on the rate-distortion function, which can be written in the form

$$R_{SH}(D) = H_0(X) - \frac{1}{2}\log_2(2\pi e D) \le R(D). \qquad (3)$$

In the above formula $H_0(X)$ is the differential entropy, which for the GGD equals

$$H_0(X) = \log_2\frac{2\,\Gamma(1/\alpha)}{\alpha\,\eta(\alpha,\sigma)} + \frac{1}{\alpha \ln 2}.$$

Achievable scalar quantization performance was thoroughly investigated by Farvardin and Modestino in [6], using Lagrangian minimization of the MSE under a restriction on the bit rate (more exactly, on the entropy of the approximation alphabet) over all quantizer parameters, for different numbers of quantization levels.
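Bounds (2) and (3) are closed-form for the GGD and can be evaluated directly. The sketch below uses hypothetical function names and assumes the $\eta(\alpha,\sigma)$ scale factor of the GGD density from Section II. Note that (2) is a high-resolution estimate, so the values it returns are meaningful only for small $D$.

```python
import math


def ggd_diff_entropy_bits(alpha, sigma=1.0):
    """Differential entropy H_0(X) of a GGD source, in bits."""
    eta = math.sqrt(math.gamma(3.0 / alpha) / math.gamma(1.0 / alpha)) / sigma
    return (math.log2(2.0 * math.gamma(1.0 / alpha) / (alpha * eta))
            + 1.0 / (alpha * math.log(2.0)))


def shannon_lower_bound(D, alpha, sigma=1.0):
    """R_SH(D) = H_0(X) - 0.5 * log2(2 * pi * e * D), equation (3)."""
    return ggd_diff_entropy_bits(alpha, sigma) - 0.5 * math.log2(2.0 * math.pi * math.e * D)


def high_resolution_scalar_rate(D, alpha, sigma=1.0):
    """High-resolution estimate R_S(D) = R_SH(D) + 0.5 * log2(pi * e / 6), equation (2)."""
    return shannon_lower_bound(D, alpha, sigma) + 0.5 * math.log2(math.pi * math.e / 6.0)


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 1.0, 2.0):
        print(alpha,
              round(shannon_lower_bound(1e-2, alpha), 3),
              round(high_resolution_scalar_rate(1e-2, alpha), 3))
```

For the Gaussian case ($\alpha = 2$, $\sigma = 1$) the Shannon bound is tight, and at $D = 10^{-2}$ the two functions give about 3.32 and 3.58 bits/sample, respectively.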
It follows from [6] that the performance of optimum uniform quantization almost coincides with that of optimum quantization for a broad class of probability distributions. This is not surprising, since it was established analytically by T. Berger [11] that for the GGD with $\alpha = 1$ (Laplacian distribution) optimum uniform quantization is entropy-optimal.

Rate-distortion functions for four generalized Gaussian distributions are shown in Figs. 4–7 together with the performance of scalar quantizers. In each figure, $R(D)$ denotes the rate-distortion function obtained using the Blahut algorithm, $R_{SH}(D)$ denotes the Shannon bound (3), and the dotted line shows the Koshelev bound (2). The curves denoted by $R_{USQ}(D)$ and $R_{OUSQ}(D)$ represent the performance of uniform scalar quantization (USQ) and optimal uniform scalar quantization (OUSQ), respectively. Here, USQ means that quantization is performed by dividing the input data by a fixed quantization step (properly chosen to provide the required bit rate) followed by rounding. The midpoints of the quantization intervals are used as the reconstruction values. OUSQ differs from USQ in the reconstruction values, which are computed as the centers of mass of the quanta. We do not show $R_S(D)$ since it is indistinguishable from $R_{OUSQ}(D)$.

Fig. 3. Theoretical and empirical rate-distortion functions.

It is easy to see from the plots that the behavior of the rate-distortion functions for small $\alpha$ differs significantly from their behavior for the Laplacian ($\alpha = 1$) and Gaussian ($\alpha = 2$) distributions. In particular, for bit rates below 1 bit per sample, scalar quantization is closer to the theoretical minimum $R(D)$ for small $\alpha$ than for large $\alpha$. For example, the distortion level of $10^{-2}$ (20 dB) for the Gaussian distribution ($\alpha = 2$) can be achieved at a bit rate of approximately 3.33 bits/sample using vector quantization, or at a rate of 3.58 bits/sample with optimal scalar quantization.
The same distortion level for the GGD with $\alpha = 0.25$ can be achieved at 1.50 bits/sample using vector quantization and at 1.61 bits/sample using scalar quantization. Therefore, the theoretically achievable gain of vector quantization of 0.25 bits/sample cannot be achieved for typical sequences of transform coefficients of audio signals.

Although uniform quantization itself can be easily implemented, optimal reconstruction is not so simple, because it requires storing reconstruction values for all quantization intervals. It would be possible to keep a full set of reconstruction levels if the bit rate were fixed and the input signal were a stationary process. However, neither of these two conditions holds in audio coding, and thus optimal uniform quantization becomes too complicated. We have found a much simpler solution that yields near-optimal scalar quantization.

IV. EZZ SCALAR QUANTIZER

In this section we present simulation results for the quantizer with the extended zero zone (EZZ). The set of quantization thresholds can be described by the following set of numbers:

$$B(j,\lambda) = \left\{\pm\lambda\,2^{j-1},\ \pm\lambda\,(2^{j-1}+1),\ \pm\lambda\,(2^{j-1}+2),\ \ldots\right\}, \qquad (4)$$

where $\lambda > 0$ is the scaling factor and $j \in \{0, 1, \ldots\}$ is the parameter which determines the size of the zero zone. It is clear from (4) that all thresholds are equally spaced with interval $\lambda$ except the two thresholds $\pm\lambda\,2^{j-1}$, whose distance is $\lambda\,2^{j}$. This value is the size of the zero zone. Obviously, the scale $B(0,\lambda)$ corresponds to uniform quantization. Examples of the scales $B(j, \lambda = 1)$ are shown in Fig. 8. Now let us choose the set of approximating values.
We consider the following sets of quantizers:

1) EZZ, with scales defined by (4) and approximating values placed in the middle of each quantization interval;

2) OEZZ (Optimized EZZ), with scales defined by (4) and optimal approximating values placed at the center of mass of the probability density function over each quantization interval;

3) SOEZZ (Sub-optimal EZZ), with scales defined by (4), optimal approximating values for the 2 intervals closest to the zero interval, and all other approximating values placed at the midpoints of the corresponding intervals.

Fig. 4. Rate-distortion function for α = 0.25.
Fig. 5. Rate-distortion function for α = 0.5.
Fig. 6. Rate-distortion function for α = 1.0.
Fig. 7. Rate-distortion function for α = 2.0.
Fig. 8. Examples of quantization scales.

For a fixed bit rate $R$ the corresponding distortion levels are related as

$$D_{EZZ}(R) \ge D_{SOEZZ}(R) \ge D_{OEZZ}(R), \qquad (5)$$

while the complexities and the amounts of side information required for describing these quantizers are related in the opposite manner. Our goal is to estimate the gaps between the distortion values in (5) for GGD random variables.

The quantization gain is defined as

$$G = 10\log_{10}\frac{\sigma^2}{D}\ \text{(dB)}, \qquad (6)$$

where $\sigma^2$ is the variance of the source data and $D$ is the variance of the quantization error. In (6) we can set $\sigma^2 = 1$ without loss of generality. Then the maximum achievable gain $G_{\max}(R)$ at a fixed bit rate $R$ can be computed as

$$G_{\max}(R) = -10\log_{10} D_0,$$

where $D_0$ is the solution of the equation $R(D_0) = R$ and $R(D)$ is the rate-distortion function. If $G(R) = -10\log_{10} D(R)$ denotes the quantization gain of some quantizer, then we call the difference

$$L(R) = G_{\max}(R) - G(R)$$

the "loss of coding gain with respect to the theoretical limit". Plots of $L(R)$ for different quantizers and for different values of the parameter $\alpha$ are shown in Fig. 9.
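The EZZ scale (4), midpoint reconstruction and the gain (6) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, a sample lying exactly on a threshold is assigned to the outer cell, the source is assumed zero-mean, and the rate is measured as the empirical entropy of the cell indices.

```python
import math
from collections import Counter


def ezz_index(x, lam, j):
    """Signed cell index of x on the scale B(j, lam); index 0 is the zero zone."""
    half_zone = lam * 2.0 ** (j - 1)          # zero zone is (-half_zone, half_zone)
    a = abs(x)
    if a < half_zone:
        return 0
    k = int(math.floor((a - half_zone) / lam)) + 1
    return k if x > 0 else -k


def ezz_reconstruct(k, lam, j):
    """Midpoint reconstruction value for cell k (plain EZZ)."""
    if k == 0:
        return 0.0
    half_zone = lam * 2.0 ** (j - 1)
    return math.copysign(half_zone + (abs(k) - 0.5) * lam, k)


def entropy_bits(indices):
    """Empirical entropy of the index sequence, in bits per sample."""
    n = len(indices)
    return -sum(c / n * math.log2(c / n) for c in Counter(indices).values())


def gain_db(xs, lam, j):
    """Quantization gain (6) for zero-mean data; assumes nonzero distortion."""
    var = sum(x * x for x in xs) / len(xs)
    dist = sum((x - ezz_reconstruct(ezz_index(x, lam, j), lam, j)) ** 2
               for x in xs) / len(xs)
    return 10.0 * math.log10(var / dist)
```

With $j = 0$ the index reduces to ordinary rounding of $x/\lambda$, which matches the remark that $B(0,\lambda)$ is the uniform scale. Replacing the reconstruction value of the first non-zero cell (SOEZZ) or of every cell (OEZZ) by a centroid only changes `ezz_reconstruct`.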
The plots in Fig. 9 for $\alpha = 1$ and $\alpha = 2$ are obtained numerically, and those for smaller $\alpha$ are obtained by simulation. It follows from these plots that SOEZZ is very close to optimal scalar quantization for all distributions, and especially for small $\alpha$, which are typical for the spectrum quantization problem. Note also that the gain of SOEZZ with respect to uniform quantization is rather high. In particular, for small $\alpha$ this gain approaches 0.5 dB, while for large $\alpha$ it reaches 1 dB. Notice again that the gain loss with respect to vector quantization is near 0.5 dB only for bit rates of about 1 bit/sample. The efficiency of known vector quantization and trellis quantization schemes of reasonable complexity is also roughly 0.5 dB below the theoretical limit.

Fig. 9. Loss of coding gain with respect to the theoretical limit. (a) GGD with parameter α = 0.25, (b) GGD with parameter α = 0.5, (c) GGD with parameter α = 1.0, (d) GGD with parameter α = 2.0.

V. ADAPTIVE SCALAR QUANTIZATION

It follows from the above considerations that near-optimal quantization performance can be achieved using SOEZZ which, according to (4), can be completely described by 3 parameters: the quantization step $\lambda$, the zero zone width parameter $j$, and the approximating value $a_1$ for the first non-zero quantization interval. The EZZ quantizer has slightly worse results than SOEZZ and can be described by two parameters: the quantization step $\lambda$ and the zero zone width parameter $j$. Both the EZZ and SOEZZ quantizers can be used efficiently in practical implementations. The OEZZ quantizer has the best results, but it is not used since the number of parameters to be transmitted to the decoder as side information is large (as many as the number of quanta in a scale).

The following approach can be used for estimating the EZZ, SOEZZ or OEZZ quantizer parameters.
The rate-distortion functions of the EZZ, SOEZZ or OEZZ quantizer for typical values of $\alpha$ have to be known to the encoder. These functions can be kept in the form of data arrays or as simple approximate analytical expressions (e.g., interpolation polynomials). A typical function for SOEZZ is shown in Fig. 10 for $\alpha = 0.5$, $\sigma = 1$. The corresponding functions for EZZ and OEZZ can differ only in the regions in which each scale index $j$ is used. Notice that for each point $(R, D)$ of the rate-distortion function the optimal parameters $(\lambda, j)$ are known to the encoder.

Let $(x_1, \ldots, x_n)$ be the data sequence to be quantized. The adaptive quantization procedure for the EZZ quantizer is shown in Fig. 11; it can be modified for use with any type of extended zero-zone quantizer. For SOEZZ, we additionally need to estimate the reconstruction value of the first non-zero quantum. For the OEZZ quantizer, we need to estimate the reconstruction values of all non-zero quanta.

The procedure starts with estimating the GGD parameters, for which we use the same approach as in [12]. First, the estimates of the variance

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2$$

and of the first absolute moment

$$\hat E[|x|] = \frac{1}{n}\sum_{i=1}^{n} |x_i|$$

are computed. The estimated parameter $\hat\alpha$ of the GGD is found as the solution of the equation

$$\frac{\hat\sigma^2}{\hat E^2[|x|]} = \frac{\Gamma(1/\hat\alpha)\,\Gamma(3/\hat\alpha)}{\Gamma^2(2/\hat\alpha)}.$$

To compute the quantizer parameters, the required distortion level $D$ obtained from the PAM module must be normalized by $\hat\sigma^2$, and the function $R(D)$ corresponding to the estimated $\hat\alpha$ has to be used for evaluating $j$ and $\lambda$.

If the SOEZZ quantizer is used, one more calculation is needed. After quantization, the reconstruction level $a_1$ can be computed as

$$a_1 = \frac{1}{n_1}\sum_{i:\ |x_i| \in [b_1, b_2]} |x_i|,$$

where the sum is computed over all $i$ such that $|x_i|$ belongs to the first non-zero quantization interval $[b_1, b_2]$ and $n_1$ is the number of elements in this sum. If the OEZZ quantizer is used, reconstruction levels must be computed in the same way for all non-zero quanta.
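The estimation steps of the adaptive procedure can be sketched as below. This is a minimal illustration with hypothetical function names, not the procedure of Fig. 11 itself. It assumes that the Gamma-function ratio in the estimating equation decreases monotonically in $\alpha$, so that bisection on a bracket (here $[0.05, 10]$, an arbitrary choice) finds the root. The half-open interval test and the midpoint fallback for an empty cell are also assumptions of this sketch.

```python
import math


def moment_ratio(alpha):
    """Gamma(1/a) * Gamma(3/a) / Gamma(2/a)^2, computed via lgamma for stability."""
    return math.exp(math.lgamma(1.0 / alpha) + math.lgamma(3.0 / alpha)
                    - 2.0 * math.lgamma(2.0 / alpha))


def solve_alpha(target, lo=0.05, hi=10.0, iters=100):
    """Bisection for moment_ratio(alpha) = target; clamps to the bracket ends."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid) > target:
            lo = mid        # ratio too large, so alpha must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_ggd(xs):
    """Return (alpha_hat, sigma2_hat) for zero-mean samples xs."""
    n = len(xs)
    var = sum(x * x for x in xs) / n
    abs_mean = sum(abs(x) for x in xs) / n
    if abs_mean == 0.0:
        raise ValueError("all samples are zero; shape parameter is undefined")
    return solve_alpha(var / abs_mean ** 2), var


def first_cell_centroid(xs, lam, j):
    """SOEZZ level a_1: mean of |x_i| falling in the first non-zero interval."""
    b1 = lam * 2.0 ** (j - 1)
    b2 = b1 + lam
    vals = [abs(x) for x in xs if b1 <= abs(x) < b2]
    # Fall back to the interval midpoint when no sample lands in the cell.
    return sum(vals) / len(vals) if vals else b1 + 0.5 * lam
```

A moment ratio of 2 corresponds to the Laplacian case ($\hat\alpha = 1$) and a ratio of $\pi/2$ to the Gaussian case ($\hat\alpha = 2$). The normalized distortion $D/\hat\sigma^2$ and $\hat\alpha$ would then index the stored rate-distortion tables to obtain $(\lambda, j)$, a lookup that is not shown here because the tables are not given in the text.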
After quantization, the reconstruction level $a_j$ of each non-zero quantum of the OEZZ quantizer can be computed as

$$a_j = \frac{1}{n_j}\sum_{i:\ |x_i| \in [b_j, b_{j+1}]} |x_i|,$$

where the sum is computed over all $i$ such that $|x_i|$ belongs to the $j$-th non-zero quantization interval and $n_j$ is the number of elements in this sum (here $j$ indexes the interval and is not the zero-zone parameter of (4)).

Fig. 10. Parameters of the SOEZZ quantizer.
Fig. 11. Adaptive quantization.

VI. CONCLUSIONS

We analyzed the efficiency of scalar quantization followed by entropy coding, used for encoding the filterbank outputs of an audio codec. The GGD is used as a model of the one-dimensional probability distribution of the data to be quantized. Using the Blahut algorithm for evaluating the rate-distortion function, we have shown that the theoretically achievable efficiency computed from the model virtually coincides with the empirical rate-distortion function obtained directly from a long audio data sequence. Thereby we justify the choice of the GGD as a source model.

The potential efficiency of scalar quantization was estimated and compared with the efficiency of vector quantization. For typical audio data the gain of vector quantization over scalar quantization is rather small. Moreover, the efficiency of uniform scalar quantization and of extended zero-zone (EZZ) quantization is close to that of optimum scalar quantization. The important advantage of uniform and EZZ quantization is that they can be described by a small number of parameters. Therefore EZZ quantization used for adaptive quantization does not require transmitting a large amount of side information.

REFERENCES

[1] J. P. Princen and A. B. Bradley, "Analysis/synthesis filterbank design based on time domain aliasing cancellation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1153-1161, Oct. 1986.
[2] H. S. Malvar, Signal Processing with Lapped Transforms. Artech House, 1992.
[3] M. Temerinec and B. Edler, "LINC: A common theory of transform and subband coding," IEEE Trans. Commun., vol. COM-41, no. 2, pp. 266-274, Feb. 1993.
[4] R. M. Gray and D. L. Neuhoff, "Quantization," IEEE Trans. Inform. Theory, vol. IT-44, no. 6, pp. 2325-2383, Oct. 1998.
[5] T. Berger and J. Gibson, "Lossy source coding," IEEE Trans. Inform. Theory, vol. IT-44, no. 6, pp. 2702-2703, Oct. 1998.
[6] N. Farvardin and J. W. Modestino, "Optimum quantizer performance for a class of non-Gaussian memoryless sources," IEEE Trans. Inform. Theory, vol. IT-30, no. 3, pp. 485-497, May 1984.
[7] J. A. Thomas and T. M. Cover, Elements of Information Theory. New York: John Wiley & Sons, 1991.
[8] R. E. Blahut, "Computation of channel capacity and rate-distortion functions," IEEE Trans. Inform. Theory, vol. IT-18, no. 4, pp. 460-473, Jul. 1972.
[9] V. N. Koshelev, "Quantization with minimum entropy," Probl. Inform. Transmiss., vol. 14, pp. 151-156, 1963 (Section X).
[10] H. Gish and J. N. Pierce, "Asymptotically efficient quantizing," IEEE Trans. Inform. Theory, vol. IT-14, no. 5, pp. 676-683, Sept. 1968.
[11] T. Berger, "Optimum quantizers and permutation codes," IEEE Trans. Inform. Theory, vol. IT-18, no. 6, pp. 759-765, Nov. 1972.
[12] K. Sharifi and A. Leon-Garcia, "Estimation of shape parameter for generalized Gaussian distributions in subband decompositions of video," IEEE Trans. on Circuits and Systems for Video Technology, vol. 5, no. 1, pp. 52-56, Feb. 1995.
[13] L. D. Davisson, "Universal noiseless coding," IEEE Trans. Inform. Theory, vol. IT-19, no. 6, pp. 783-795, Nov. 1973.

Boris D. Kudryashov was born in Leningrad, USSR (now St.-Petersburg, Russia) on July 9, 1952. He received the Diploma degree in electrical engineering in 1974 and the Ph.D. degree in technical sciences in 1978, both from the Leningrad Institute of Aerospace Instrumentation (LIAP), and the Doctor of Science degree from the Institute of Information Transmission Problems (IPPI), Moscow, in 2005. Since 1978 he has been first Assistant Professor and then Associate Professor and Professor at the State University of Aerospace Instrumentation (former LIAP), St.-Petersburg, Russia. Since 2007 he has been Professor at the St.-Petersburg State University of Information Technology, Mechanics and Optics, St.-Petersburg, Russia. His research interests include coding theory, information theory and applications to speech, audio, and image coding. He has published more than 70 papers in journals and proceedings of international conferences, and has 15 US patents and published patent applications in image, speech and audio coding. Prof. Kudryashov has served as a member of the Organizing Committees of ACCT International Workshops.

Anton V. Porov was born in Leningrad, USSR (now St.-Petersburg, Russia) on January 12, 1980. He received the Diploma degree in computer science in 2003. Since 2003 he has been an Engineer at the State University of Aerospace Instrumentation (GUAP), St.-Petersburg, Russia. In 2005-2007 he was a member of the Assistant Staff of the Samsung Advanced Institute of Technology (SAIT), Suwon-si, South Korea. Since 2008 he has been a research engineer at the St.-Petersburg State University of Information Technology, Mechanics and Optics, St.-Petersburg, Russia. His research interests include coding theory, information theory and applications to speech and audio coding.

Eunmi L. Oh received the Ph.D. degree in psychology from the University of Wisconsin-Madison, WI, USA, in 1997. Her major was psychoacoustics, focusing on masking models. She is currently working at the Samsung Advanced Institute of Technology, Yongin, South Korea. Her recent work is concerned with perceptual coding and scalable audio coding.
