
Publications
(A): International Refereed Journal Papers
-
S. V. Bharath Kumar and S. Umesh [2008]: ``Non-Uniform Speaker Normalization Using Affine Transformation,'' To Appear in Journal of the Acoustical Society of America, Vol. 124, No. 3, Sep. 2008
-
R. Sinha and S. Umesh [2008]: ``A Shift based Approach to Speaker Normalization using Non-Linear Frequency-Scaling Model,'' ISCA Transactions on Speech Communication, Vol. 50, No. 3, pp. 191-202, Mar. 2008
-
S. Umesh and R. Sinha [2007]: ``A Study of Filter-Bank Smoothing in MFCC Features for Recognition of Children Speech,'' IEEE Transactions on Audio, Speech and Language Processing, Volume 15, Issue 8, Nov. 2007 Page(s): 2418 – 2430
-
S. Umesh, L. Cohen and D. Nelson [2007]: ``Fluctuations in Speech,'' Fluctuations and Noise Letters, Vol. 7, No. 3, Sep. 2007, pp. 215-224
-
S. Umesh, L. Cohen and D. Nelson [2002]: ``The Speech Scale,'' Acoustics Research Letters Online of the Journal of Acoustical Society of America, Vol. 3, Issue 3, pp.83-88, July 2002.
-
S. Umesh, L. Cohen and D. Nelson [2002]: ``Frequency Warping and the Mel-scale,'' IEEE Signal Processing Letters, vol. 9, no. 3, pp.104-107, March 2002.
-
S. Umesh, L. Cohen, N. Marinovic, and D. J. Nelson [1999]: ``Scale-Transform in Speech Analysis,'' IEEE Transactions on Speech and Audio Processing, vol. 7, no. 1, pp.40-45, Jan. 1999.
-
S. Umesh and D. W. Tufts [1996]: ``Estimation of Parameters of Multiple Exponentially Damped Sinusoids using Fast Maximum Likelihood Estimation with Application to NMR Spectroscopy Data,'' IEEE Trans. Signal Processing, vol. 44, no. 9, pp.2245-2259, Sept. 1996.
-
D. W. Tufts, H. Ge, and S. Umesh [1993]: ``Fast Maximum Likelihood Estimation of Signal Parameters using the Shape of the Compressed Likelihood Function,'' IEEE Journal of Oceanic Engg., Vol. 18, no. 4, pp. 388-400, Oct. 1993.
(Invited Paper).
(B): Refereed Papers in International Conferences held outside India
-
D. R. Sanand and S. Umesh [2008]: ``Study of Jacobian Compensation Using Linear Transformation of Conventional MFCC for VTLN'', To Appear in Interspeech-2008, Brisbane, Sep. 2008
-
D. R. Sanand, V. Balaji, R. Sandhya Rani and S. Umesh [2008]: ``Use of Spectral Center of Gravity for Generating Speaker Invariant Features for Automatic Speech Recognition'', To Appear in Interspeech-2008, Brisbane, Sep. 2008
-
P. T. Akhil, S. P. Rath, S. Umesh and D. R. Sanand [2008]: ``A Computationally Efficient Approach to Warp Factor Estimation in VTLN Using EM Algorithm and Sufficient Statistics'', To Appear in Interspeech-2008, Brisbane, Sep. 2008
-
D. R. Sanand, D. Dinesh Kumar and S. Umesh [2007]: ``Linear Transformation Approach to VTLN Using Dynamic Frequency Warping,'' Proc. of International Conference on Spoken Language Processing (Interspeech 2007), Antwerp, Belgium, August 27-31, 2007. [Acceptance ratio: 59% = 748/1268]
-
S. Umesh, L. Cohen and D. Nelson [2007]: ``Fluctuations in speech,'' Proc. of Conference on Noise and Fluctuations in Biological, Biophysical, and Biomedical Systems, Florence, Italy, May 2007
-
S. Umesh, D. Rama Sanand, G. Praveen [2007]: ``Speaker-Invariant Features for Automatic Speech Recognition,'' Proc. of International Joint Conferences on Artificial Intelligence, (IJCAI-07), pp. 1738-1743, Jan. 2007 [Acceptance ratio: 15.5% = 212/1365]
-
S. V. Bharath, S. Umesh and R. Sinha [2006]: ``Study of Non-Linear Frequency Warping Functions for Speaker Normalization,'' To Appear in Proc. of IEEE International Conf. on Acoustic, Speech and Signal Processing, (ICASSP Toulouse), April 2006 [Acceptance ratio: 48.1% = 1465/3045]
-
J. Lööf and H. Ney and S. Umesh [2006]: ``VTLN Warping Factor Estimation Using Accumulation of Sufficient Statistics,'' To Appear in Proc. of IEEE International Conf. on Acoustic, Speech and Signal Processing, (ICASSP Toulouse), April 2006 [Acceptance ratio: 48.1% = 1465/3045]
-
S. Umesh, A. Zolnay and H. Ney [2005]: ``Implementing Frequency-Warping and VTLN Through Linear Transformation of Conventional MFCC,'' Proc. of InterSpeech 2005, (Lisbon, Portugal), Sep.'2005 [Acceptance ratio: 62% = 855/1379]
-
S. Umesh, L. Cohen and D. Nelson [2005]: ``The Speech Scale and Spectral Transformation,'' Proc. of SPIE Conference on Wavelet Applications in Signal & Image Proc., July'2005
-
S. V. Bharath and S. Umesh [2004]: ``Non-uniform speaker normalization using frequency-dependent scaling function,'' Proc. IEEE International Conference on Signal Processing and Communications, (Bangalore), December 2004
-
S. Tranter, M. J. Gales, R. Sinha, S. Umesh and P. Woodland [2004]: ``The Development of the Cambridge University RT-04 Diarisation System,'' Proc. of 2004 Rich Transcription Workshop (RT-04), (Palisades, NY, USA), November 2004
-
D. Kim, S. Umesh, M. J. Gales, T. Hain and P. Woodland [2004]: ``Using VTLN for Broadcast News Transcription,'' Proc. of International Conference on Spoken Language Processing , (ICSLP, Jeju Island, S.Korea), October 2004
-
S. V. Bharath, S. Umesh and R. Sinha [2004]: ``Non-Uniform Speaker Normalization using Affine Transformation,'' Proc. of IEEE International Conf. on Acoustic, Speech and Signal Processing, (ICASSP Montreal), Vol. I, pp.121-124, April 2004 Voted the top paper in its review category [Acceptance ratio: 51.8% = 1262/2434]
-
S. Umesh, R. Sinha and S. V. Bharath [2004]: ``An Investigation into Front-End Signal Processing for Speaker Normalization,'' Proc. of IEEE International Conference on Acoustic, Speech and Signal Processing, (ICASSP Montreal), Vol. I, pp.345-348, April 2004 [Acceptance ratio: 51.8% = 1262/2434]
-
D. Nelson, D. Smith, S. Umesh, L. Cohen [2003]: ``Estimating speaker scale factors from vowels,'' Proc. of SPIE Conference on Wavelets: Applications in Signal and Image Processing, vol. 5207, pp. 794-800, July 2003.
-
R.Sinha and S. Umesh [2003]: ``A Method for Compensation of Jacobian in Speaker Normalization,'' Proc. of IEEE International Conference on Acoustic, Speech and Signal Processing, (ICASSP Hong Kong), April 2003
-
R. Sinha and S. Umesh [2003]: ``A Study into Front-End Signal Processing for Automatic Speech Recognition,'' Proc. of Workshop on Spoken Language Processing , (TIFR,Mumbai), pp. 87 - 92, January 2003
-
S. Umesh, L. Cohen and D. Nelson [2002]: ``The speech scale, the Mel scale and the Tube Model for Speech,'' Proc. of SPIE Conference on Advanced Signal Processing Algorithms, Architectures and Implementations, vol. 4791, pp. 7 - 23, July 2002.
-
R.Sinha and S. Umesh [2002]: ``Non-Uniform Scaling Based Speaker-Normalization,'' Proc. of IEEE International Conference on Acoustic, Speech and Signal Processing, (ICASSP Orlando, USA), Vol. I, pp. 589-592, May 2002 [Acceptance ratio: 56.9% = 1007/1770]
-
S. Umesh, S. V. Bharath, M. K. Vinay, R. Sharma and R. Sinha [2002]: ``A Simple Approach to Non-Uniform Vowel Normalization,'' Proc. of IEEE International Conference on Acoustic, Speech and Signal Processing, (ICASSP, Orlando, USA),Vol. I, pp. 517-520, May 2002 [Acceptance ratio: 56.9% = 1007/1770]
-
S. Umesh, D. Nelson and L. Cohen [2001]: ``Further Experimental Results on the Speech-Hearing Connection,'' Proc. of SPIE Conference on Wavelet Applications in Signal & Image Proc., Vol. 4478, pp. 361-366, July'2001
-
S. Umesh, Richard C. Rose, and S. Parthasarathy [2000]: ``Exploiting Frequency-Scaling Invariance Properties of the Scale Transform for Automatic Speech Recognition,'' in Proc. of International Conference on Spoken Language Processing, (ICSLP Beijing, China), pp. 651-654, Oct.'2000
-
D. Nelson, S. Umesh, and L. Cohen [2000]: ``High Frequency Formant Estimation & Its Application in Frequency-Scaling of Speech,'' Proc. of SPIE Conference on Wavelet Applications in Signal & Image Proc., vol. 4119, pp. 294-301, July'2000
-
S. Umesh, L. Cohen and D. Nelson [1999]: ``Scale-Transform Based Features for Application in Speech Recognition,'' Proc. of SPIE Conference on Wavelet Applications in Signal & Image Proc., Vol. 3813, pp.727-731, July'1999
-
S. Umesh, L. Cohen, and D. Nelson [1999]: ``Fitting the Mel-Scale,'' Proc. IEEE International Conference on Acoust. Speech, Signal Processing, (ICASSP Phoenix, Arizona, USA), Vol. 1, pp. 217-220, March 1999. [Acceptance ratio: 58.2% = 869/1490]
-
S. Umesh, L. Cohen and D. Nelson [1998]: ``Warping Functions in Speech,'' Proc. of SPIE Conference on Wavelet Applications in Signal & Image Proc., Vol. 3458, pp. 194-209, July'1998
-
S. Umesh, L. Cohen, and D. J. Nelson [1998]: ``Improved Scale-Cepstral Analysis in Speech,'' IEEE International Conference on Acoust. Speech, Signal Processing, (ICASSP Seattle, USA), pp. 637-640, May 1998.
-
S. Umesh, L. Cohen, and D. Nelson [1997]: ``Improvements in Scale-Cepstral Features for Speech Analysis,'' in Proc. SPIE Conference on Wavelet Applications in Signal & Image Proc., (San Diego, USA), vol. 3169, pp. 481-494, July 1997.
-
S.Umesh, A. Rao, G.Cristobal, L.Cohen and J.H. van Deemter [1997]: ``Global and local translation and magnification'' Proc. of SPIE Conference on Statistical & Stochastic Methods in Image Processing, Vol. 3167, pp.106-117, July'1997.
-
S. Umesh, L. Cohen, and D. J. Nelson [1997]: ``Frequency-Warping and Speaker-Normalization,'' IEEE International Conference on Acoust. Speech, Signal Processing, (ICASSP Munich, Germany), pp. 983-986, May 1997.
-
S. Umesh, L. Cohen, N. Marinovic, and D. J. Nelson [1996]: ``Frequency-Warping in Speech,'' in Proc. International Conference on Spoken Language Processing, (ICSLP Philadelphia,USA), pp. 414-417, October 1996.
-
S. Umesh, L. Cohen, N. Marinovic, and D. J. Nelson [1996]: ``Psychoacoustic-Frequency Scales versus Frequency-Warping in Scale cepstrum,'' in Proc. SPIE Conference on Wavelet Applications in Signal & Image Proc., Vol. 2825, pp. 530-539, July 1996.
-
S. Umesh and D. J. Nelson [1996]: ``Computationally Efficient Estimation of Sinusoidal Frequency at low SNR,'' in Proc. IEEE International Conference on Acoust. Speech, Signal Processing, (ICASSP Atlanta, USA), pp. 2797-2800, May 1996.
-
L. Cohen, N. Marinovic, S. Umesh, and D. Nelson [1995]: ``Scale-Invariant Speech Analysis via joint time-frequency-scale processing,'' in Proc. SPIE Conference on Wavelet Applications in Signal & Image Proc., Vol. 2569, pp. 522-537, July 1995.
-
N. Marinovic, L. Cohen, S. Umesh, and D. Nelson [1995]: ``Classification of Digital Modulation Types,'' in Proc. SPIE Conference on Advanced Signal Processing Algorithms, vol. SPIE-2563, (San Diego, USA), pp. 125-143, July 1995.
-
N. Marinovic, L. Cohen, and S. Umesh [1994]: ``Joint Representations in Time and Frequency Scale for Harmonic Type Signals,'' in Proc. IEEE-SP International Symposium on T-F and T-S Representations, (Philadelphia, PA), pp. 84-87, October 1994.
-
N. Marinovic, L. Cohen, and S. Umesh [1994]: ``Scale and Harmonic Signal Analysis,'' in Proc. International Society of Optical Engineering Conference on Wavelet Applications in Signal & Image Proc., Vol. 2303, pp. 411-418, August 1994.
-
E. Wilson, S. Umesh, and D. W. Tufts [1993]: ``Multistage Neural Network Structure for Transient Detection and Feature Extraction,'' in Proc. IEEE International Conference on Acoust. Speech, Signal Processing, (ICASSP Minneapolis, USA), pp. 489-492, April 1993.
-
E. Wilson, S. Umesh, and D. W. Tufts [1992]: ``Designing a Neural Network Structure for Transient Detection Using the Subspace Inhibition Filter Algorithm,'' in Proc. IEEE Oceans '92, pp. 120-125 (Newport, USA), Oct. 1992.
-
E. Wilson, S. Umesh, and D. W. Tufts [1992]: ``Resolving the Components of Transient Signals Using the Neural Network and Subspace Inhibition Filter Algorithms,'' in Proc. International Joint Conference on Neural Networks, (Baltimore, USA), pp. 283-288, June 1992.
-
S. Umesh and D. W. Tufts [1992]: ``Resolving the Components of Transient Signals by a Multistage Procedure,'' in Proc. IEEE International Conference on Acoust. Speech, Signal Processing, (ICASSP San Francisco, USA), pp. 553-556, March 1992.
-
G. F. Boudreaux-Bartels, D. W. Tufts, and S. Umesh [1991]: ``On Improving the Detection of Gabor Components,'' in Proc. of Mini ASSP Conference, (Boston, USA), April 1991.
(C): Papers in Conferences held in India
-
D. Dinesh Kumar, D. R. Sanand and S. Umesh [2008]: ``Linear Transformation Approach to Speaker Normalization on Conventional MFCC,'' Proc. of National Conference on Communications, IIT-Bombay, Feb. 2008
-
R. Sandhya Rani, D. R. Sanand and S. Umesh [2008]: ``Speaker Normalisation Using Center of Gravity,'' Proc. of National Conference on Communications, IIT-Bombay, Feb. 2008
-
S. P. Rath, D. R. Sanand and S. Umesh [2008]: ``MAP based Warping factor Estimation in Vocal Tract Length Normalization,'' Proc. of National Conference on Communications, IIT-Bombay, Feb. 2008
-
Mohd Amir Khan, D. Rama Sanand, S. Umesh [2007]: ``Jacobian Compensation Using Variance Normalization in Automatic Speech Recognition,'' Proc. of National Conference on Communications, IIT-Kanpur, Jan. 2007
-
S. Umesh, R. Sinha, D Rama Sanand [2007]: ``Using Vocal-Tract Length Normalization in Recognition of Children Speech,'' Proc. of National Conference on Communications, IIT-Kanpur, Jan. 2007
-
R. Sinha and S. Umesh [2006]: ``Linear-Transformation Approach to Shift-Based Speaker-Normalisation,'' Proc. of National Conference on Communications, (IIT, Delhi), January 2006
-
S. Umesh and S. V. Bharath [2006]: ``Study of Non-linear Frequency Warping functions for Speaker Normalisation,'' Proc. of National Conference on Communications, (IIT, Delhi), January 2006
-
R. Sinha and S. Umesh [2003]: ``Spectral Smoothing for Vocal-Tract Length Normalization,'' Proc. of National Conference on Communications , (IIT,Chennai), pp. 87 - 92, January 2003
-
S. Umesh, M. Belkhode and Rohit Sinha [1999]: ``Comparison of Front-End Features for Speech Recognition,'' Proc. of National Conf. on Communications, (Kharagpur), pp. 163-170, Jan. 1999
(D): Technical Workshop Presentations
-
S. Tranter and S. Umesh [2004]: ``Diarisation Research at CUED,'' Meta-Data Evaluation (MDE) Technical Meeting of U.S. ARPA's Effective Affordable Reusable Speech (EARS) Project, (Boston, USA), May 2004
-
D.Y. Kim, M.J.F. Gales, H.Y.Chan, P.C. Woodland, S. Umesh and T. Hain [2004]: ``Progress in Broadcast News English Transcription,'' Speech-to-Text (STT) Workshop of ARPA's EARS Project, (Montreal, Canada), May 2004
(E): Invited Talks
-
S. Umesh [2007]: ``Introduction to Large Vocabulary Continuous Speech Recognition'' National Conference on Communications, (IIT-Kanpur), Jan.2007
-
S. Umesh [2006]: ``Statistical Fundamentals for Speech Recognition'' Winter School on Speech & Audio Processing (WISSAP-06), (IISc., Bangalore), Jan. 2006
-
S. Umesh [2005]: ``Large Vocabulary Continuous Speech Recognition,'' International Conference on Natural Language Processing, (IIT, Kanpur), Dec. 2005
(F): Books
-
Ajit K. Chaturvedi, Srinivasan Umesh, Adrish Banerjee, Kameswari Chebrolu, Joseph John, Ayyangar R. Harish (Editors): Proceedings of the Thirteenth National Conference on Communications, I.I.T. Kanpur, 26-28 January 2007. ISBN Number: 978-81-904444-0-8
