Publications and conference presentations (selected)

Tornike Karchkhadze, Mohammad Rasool Izadi, Shuo Zhang. Improving Source Extraction with Diffusion and Consistency Models. Audio Imagination: NeurIPS 2024 Workshop. [OpenReview][preprint]
Aleksandra Ma, Sile Yin, Shuo Zhang. Audio-Visual Target Speaker Speech Enhancement. SANE 2024 Workshop, Cambridge, MA.
Tornike Karchkhadze, Hassan Salami Kavaki, Mohammad Rasool Izadi, Bryce Irvin, Mikolaj Kegler, Ari Hertz, Shuo Zhang, Marko Stamenovic. Latent CLAP Loss for Better Foley Sound Synthesis. Proc. EUSIPCO 2024. [eusipco] [preprint]
Mohammad Rasool Izadi, Yujia Yan, Shuo Zhang, Robert Stevenson. Towards Optimal Voice Disentanglement With Weak Supervision. Proc. ICASSP 2024. [IEEE Xplore][related: poster at SANE 2022]
Bryce Irvin, Sile Yin, Shuo Zhang, Marko Stamenovic. A Fullband Neural Network For Audio Packet Loss Concealment. ICASSP 2024 Audio Deep Packet Loss Concealment Grand Challenge [paper]. In Proc. of ICASSPW 2024. [IEEE Xplore]
J. Williams, T. Azim, A.-M. Piskopani, A. Chamberlain and S. Zhang. “Socio-Technical Trust For Multi-Modal Hearing Assistive Technology,” 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece, 2023, pp. 1-5. [IEEE Xplore]
N Shashaank, Berker Banar, Mohammad Rasool Izadi, Jeremy Kemmerer, Shuo Zhang, Chuan-Che (Jeff) Huang. HiSSNet: Sound Event Detection and Speaker Identification via Hierarchical Prototypical Networks for Low-Resource Headphones. Proceedings of ICASSP 2023, Rhodes Island, Greece. [arXiv][IEEE Xplore]
Alnajjar, K., Hämäläinen, M., Zhang, S. Ring That Bell: A Corpus and Method for Multimodal Metaphor Detection in Videos. Proceedings of the Third Workshop on Figurative Language Processing (FigLang) at EMNLP 2022. [U of Helsinki] [Zenodo][ACL Anthology]
Zhang, S. Data mining Mandarin tone contour shapes. Proceedings of the 16th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology (at ACL 2019). Association for Computational Linguistics: Florence, Italy, August 2019. [preprint][ACL Anthology]
Caro, R., Zhang, S., Serra, X. Quantitative analysis of the relationship between linguistic tones and melody in jingju using music scores. Proceedings of the 3rd International Workshop on Digital Libraries for Musicology (DLfM) at the International Society for Music Information Retrieval Conference (ISMIR) 2017, Shanghai, China, October 2017. Published by ACM-ICPS. [MTG][UPF e-repositori][ACM Digital Library]
Zhang, S., Caro, R., Serra, X. Understanding the expressive functions of jingju metrical patterns through lyrics text mining. Proceedings of the 18th International Society for Music Information Retrieval (ISMIR) Conference, Suzhou, China, October 2017. [MTG] [UPF e-repositori]
Zhang, S. Mining Linguistic Tone Patterns Using Fundamental Frequency Time-Series Data. Ph.D. dissertation, Department of Linguistics, Georgetown University (2017). [GU]
Zhang, S., Zeldes, A. GitDOX: A Linked Version Controlled Online XML Editor for Manuscript Transcription. Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference (FLAIRS 30, published by AAAI Press), May 22-24, 2017, Marco Island, FL. [pdf @ AAAI Press]
Zhang, S. RankLyrics: A ranking-based approach to automatic song lyrics generation. MASCSLL 2017. [pdf]
Zhang, S. Mining linguistic tone patterns with symbolic representation. Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology (SIGMORPHON at ACL 2016), Association for Computational Linguistics, Berlin, Germany, August 2016. [ACL Anthology]
Zeldes, A., Zhang, S. When Schemas Change Rules Help: A Configurable Approach to Coreference beyond OntoNotes. In: Proceedings of the NAACL 2016 Workshop on Coreference Resolution Beyond OntoNotes (CORBON). Association for Computational Linguistics, San Diego, CA, June 2016. [ACL Anthology]
Zhang, S., Caro, R., Serra, X. Predicting pairwise pitch contour relations based on linguistic tone information in Beijing opera singing. Proceedings of the 16th International Society for Music Information Retrieval (ISMIR) Conference, Malaga, Spain, October 26-30, 2015. [MTG]
Zhang, S., Caro, R., Serra, X. Study of the similarity between linguistic tones and melodic pitch contours in Beijing Opera singing. Proceedings of the 15th International Society for Music Information Retrieval (ISMIR) Conference, pp. 345-348. Taiwan, October 27-31, 2014. [MTG]
Zhang, S. Modeling Frequency Effects in Mandarin Zero-onset Variation. NWAV 43, Chicago, IL, 2014. [abstract]