Background
Prior research has identified pitch and rhythm as key components of music cognition (Dowling & Fujitani, 1971; Fitch, 2013), but recent studies suggest that listeners are also sensitive to other musical features (Honing, 2018). Recent work has therefore used familiar music to evaluate how perceptual manipulations affect music recognition in memory tasks (Li et al., 2023). Until recently, however, such studies have largely focused on single components (e.g., pitch) and on single societies, which limits insight into how environmental factors influence these cognitive processes.
Aim
Expanding on earlier work (Li et al., in preparation), the current study investigates the relative salience of spectral and temporal information in the cognition of culturally familiar and unfamiliar music, aiming to assess how the manipulation of these features impacts music recognition within and across different societies.
Methods
We use the matching-pairs (MP) paradigm (TuneTwins; Li et al., 2023) to evaluate how degradations in the spectral and temporal domains affect musical memory across corpora from three cultures: Western TV theme songs, BaYaka music from Congo, and traditional Chinese instrumental music.
Stimuli are progressively degraded using joint time–frequency scattering (JTFS; Andén et al., 2019) to compare the impact of spectral and temporal information loss on music cognition. MP integrates these degraded stimuli into a gamified, memory-based task in which participants match eight pairs of musical stimuli in as few moves as possible. This enables us to evaluate their capacity to recall and match stimuli when one member of each pair has undergone varying levels of spectrotemporal degradation. An international participant group will be recruited, with each game iteration selecting randomly from the three corpora and degradation levels.
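The board assembly described above can be sketched as follows. This is a minimal illustration, not the TuneTwins implementation: the corpus labels, the integer degradation levels, and the `build_board` helper are all hypothetical, the sketch assumes one corpus and one degradation level per game (the text leaves this open), and the JTFS degradation itself is not performed here, only referenced by level.

```python
import random

# Hypothetical labels for the three corpora named in the text.
CORPORA = ("western_tv_themes", "bayaka", "chinese_instrumental")
# Hypothetical degradation levels; 0 stands for the undegraded original.
DEGRADATION_LEVELS = (1, 2, 3)
N_PAIRS = 8  # eight pairs per game, as stated in the text


def build_board(excerpts_by_corpus, rng):
    """Return 16 shuffled cards: each excerpt once intact, once degraded.

    Assumption: a single corpus and a single degradation level are drawn
    per game iteration.
    """
    corpus = rng.choice(CORPORA)
    level = rng.choice(DEGRADATION_LEVELS)
    excerpts = rng.sample(excerpts_by_corpus[corpus], N_PAIRS)

    cards = []
    for excerpt in excerpts:
        # One member of the pair is left intact ...
        cards.append({"excerpt": excerpt, "corpus": corpus, "level": 0})
        # ... and the other is marked for degradation at the drawn level.
        cards.append({"excerpt": excerpt, "corpus": corpus, "level": level})

    rng.shuffle(cards)
    return cards


# Usage with placeholder excerpt identifiers (ten per corpus).
pools = {c: [f"{c}_{i:02d}" for i in range(10)] for c in CORPORA}
board = build_board(pools, random.Random(0))
```

Passing a seeded `random.Random` instance keeps each board reproducible, which may be useful when the same configuration has to be replayed for analysis.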
Expected results
We anticipate that spectrotemporal degradation (via JTFS) will enable us to separate the roles of the spectral and temporal dimensions in music cognition and to assess the extent to which degrading each component affects listeners’ task performance in MP. In addition, we expect cultural familiarity to moderate these effects: participants from some societies (e.g., Western listeners) will rely primarily on pitch over rhythm (i.e., they will be more sensitive to spectral degradations), whilst participants from other societies may show comparable reliance on both dimensions.
Discussion and conclusion
By degrading musical stimuli along spectral and temporal dimensions (Lostanlen & Hecker, 2019), this study compares the relative importance of these two acoustic parameters. Our work builds on Albouy et al. (2020), who suggest that music perception relies more on spectral information, while temporal features are more critical in speech processing. Testing this across different cultural groups allows us to assess whether this asymmetry, in which spectral cues appear more prominent in music perception, is a consistent feature of music cognition or varies between societies. If no asymmetry emerges under systematic spectrotemporal degradation, this may suggest that sensitivity to rhythm is not a specialised feature of speech processing. Conversely, a consistent asymmetry would further support the idea of specialised perceptual tuning for music and highlight the importance of spectral cues in music recognition.
References
Albouy, P., Benjamin, L., Morillon, B., & Zatorre, R. J. (2020). Distinct sensitivity to spectrotemporal modulation supports brain asymmetry for speech and melody. Science, 367(6481), 1043–1047. https://doi.org/10.1126/science.aaz3468
Andén, J., Lostanlen, V., & Mallat, S. (2019). Joint time–frequency scattering. IEEE Transactions on Signal Processing, 67(14), 3704–3718. https://doi.org/10.1109/TSP.2019.2918992
Dowling, W. J., & Fujitani, D. S. (1971). Contour, interval, and pitch recognition in memory for melodies. The Journal of the Acoustical Society of America, 49, 524–531. https://doi.org/10.1121/1.1912382
Fitch, W. T. (2013). Rhythmic cognition in humans and animals: Distinguishing meter and pulse perception. Frontiers in Systems Neuroscience, 7(68). https://doi.org/10.3389/fnsys.2013.00068
Honing, H. (2018). Musicality as an upbeat to music: Introduction and research agenda. In H. Honing (Ed.), The Origins of Musicality. The MIT Press. https://doi.org/10.7551/mitpress/10636.003.0004
Li, J., Baker, D. J., Burgoyne, J. A., & Honing, H. (2023). Is pitch information indispensable for music recognition? A pilot study based on a musical matching pairs game. In The e-proceedings of the 17th International Conference on Music Perception and Cognition and the 7th Conference of the Asia-Pacific Society for the Cognitive Sciences of Music. https://hdl.handle.net/11245.1/a765dc1c-8543-4909-8ebd-e0a28995b11e
Li, J., Baker, D. J., Burgoyne, J. A., Han, H., Henry, N., Lostanlen, V., Sadakata, M., van der Vlist, M. M. C., van Schaik, F. T. M., Janmaat, K., & Honing, H. (in preparation). Is Pitch Indispensable for Music Recognition? Robust Music Recognition Under Spectral and Temporal Degradations in BaYaka Hunter-Gatherers.
Lostanlen, V., & Hecker, F. (2019). The shape of RemiXXXes to come: audio texture synthesis with time-frequency scattering. Proceedings of the International Conference on Digital Audio Effects (DAFx). https://doi.org/10.48550/arXiv.1906.09334