Authors: Douglas K. Bemis, Gary F. Marcus, Dina Lipkind & Ofer Tchernichovski
As part of a comparative study of early vocal learning in songbirds and human infants, we have analyzed the development of phonetic syntax in the babbling utterances of infants, using data from the CHILDES database. Here we present a detailed description of the data and analysis methods used.
Data for the analysis of human babbling was obtained from nine children in the Davis corpus (Davis & MacNeilage, 1995) of the CHILDES database (MacWhinney, 2000): Ben, Cameron, Charlotte, Georgia, Paxton, Rachel, Rebecca, Rowan, and Sam. On average, these children were 9 months, 28.3 days old (2 months, 1.3 days std.) at the first session, and data was collected for an average of 1 year, 7 months (7 months, 12.8 days std.) after that session. The data for each child consisted of 38.8 sessions on average (10.2 std.), recorded an average of 16.07 days apart (6.4 days std.).
Our analysis focused on characterizing the structure of infant babbling, specifically the manner in which children combine syllables during prelinguistic vocal exploration. Thus, we restricted our analysis to utterances for which no lexical items were assigned in the CHILDES transcriptions. We parsed these babbled utterances into syllables from the transcribed phonemes using a semi-automated method, as described below. We then performed our analysis over those utterances that received a complete syllabic parse. On average, this amounted to 2135 utterances per child (924 std.) and 62.0 utterances per session (37.5 std.).
The CHILDES data consists of phonetically transcribed utterances. To convert this data into syllables, we used an iterative process to parse each babbling utterance into a sequence of accepted syllables. An utterance was only considered parsed (and so usable in the analysis) if every phoneme in the utterance was successfully assigned to a syllable by this algorithm. In other words, we only considered an utterance for our analysis if we could fit a sequence of syllables to the phonemes such that each phoneme was used exactly once in a syllable.
On each iteration of the parsing algorithm, we first manually assigned complete syllabic parses to several unparsed utterances. We then added these new syllables to the set of possible syllables that could be used to parse the utterances. Next, we automatically checked all of the utterances to see if they could be exhaustively parsed using the current store of syllables. For example, an utterance badaja would be manually assigned the syllabic parse ba, da, ja. On the following automatic parsing pass, an utterance baja could be automatically parsed into the syllables ba and ja. Utterances that could not be fully parsed using the set of defined syllables were then manually parsed, adding to the set of acceptable syllables. Thus, every syllable used to parse the data was manually verified as a valid syllable within the data.
If an utterance could be assigned two or more different parses, we employed a heuristic: we chose the parse with the greatest number of two-phoneme syllables (CV or VC). If multiple parses for an utterance were equal on this measure, we either manually assigned a parse to the utterance or left it as ambiguous. All such ambiguous parses were excluded from the analysis. This process was repeated until a sizeable proportion of the data had been parsed from phonemes into syllables (58.2% of babbling utterances [19.0% std.]).
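The automatic pass and the tie-breaking heuristic described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' Matlab code: the function names are hypothetical, each character is assumed to stand for one phoneme, and any two-phoneme syllable is counted toward the heuristic without checking that it is actually CV or VC.

```python
from typing import List, Optional, Set


def all_parses(phonemes: str, inventory: Set[str]) -> List[List[str]]:
    """Return every split of `phonemes` into syllables from `inventory`
    that uses each phoneme exactly once (assumes one character per phoneme)."""
    if not phonemes:
        return [[]]
    parses = []
    for end in range(1, len(phonemes) + 1):
        head = phonemes[:end]
        if head in inventory:
            for rest in all_parses(phonemes[end:], inventory):
                parses.append([head] + rest)
    return parses


def choose_parse(phonemes: str, inventory: Set[str]) -> Optional[List[str]]:
    """Return the parse with the most two-phoneme syllables, or None if the
    utterance cannot be parsed or the best parses are tied (ambiguous)."""
    parses = all_parses(phonemes, inventory)
    if not parses:
        return None

    def score(parse: List[str]) -> int:
        # Assumption: any syllable of length 2 counts as a CV or VC syllable.
        return sum(len(syllable) == 2 for syllable in parse)

    best = max(score(p) for p in parses)
    winners = [p for p in parses if score(p) == best]
    return winners[0] if len(winners) == 1 else None


# Syllables verified by hand on an earlier iteration (from "badaja").
inventory = {"ba", "da", "ja"}
print(choose_parse("baja", inventory))
print(choose_parse("bagu", inventory))
```

In this sketch a result of None marks an utterance that would go back to the manual step, either to add new syllables to the inventory or to resolve a tie by hand.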
From this set of parsed babbling utterances, we then tabulated the frequency of each syllable within each session and its placement within its utterance. To parallel the birdsong analysis, we restricted the analysis below to those syllables that reached a frequency threshold within a session, set at 1% of the total number of syllables in the session. Thus, we focused only on those syllables that the child produced at a non-negligible rate within a given recording session. Finally, we also calculated the frequency of all transitions between the syllables. A transition was defined as any two sequential syllables that occurred within a single utterance.
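As a rough illustration of this tabulation, the Python sketch below counts syllables and transitions for one session and applies the 1% threshold. The function name and data layout (a session as a list of utterances, each a list of syllable strings) are assumptions made for the example. The sketch treats a syllable as passing when its share is at least the threshold and counts transitions over all syllables; the text does not say whether transitions were restricted to above-threshold syllables.

```python
from collections import Counter
from typing import Counter as CounterT, List, Set, Tuple

Utterance = List[str]


def session_counts(
    utterances: List[Utterance], threshold: float = 0.01
) -> Tuple[CounterT[str], Set[str], CounterT[Tuple[str, str]]]:
    """Tabulate syllable counts, above-threshold syllables, and transitions
    for a single session."""
    syllable_counts = Counter(s for u in utterances for s in u)
    total = sum(syllable_counts.values())
    # Keep syllables making up at least `threshold` of the session's syllables.
    frequent = {s for s, n in syllable_counts.items() if n / total >= threshold}
    # A transition is any two sequential syllables within one utterance.
    transitions = Counter((a, b) for u in utterances for a, b in zip(u, u[1:]))
    return syllable_counts, frequent, transitions


session = [["ba", "da", "ja"], ["ba", "ja"], ["ba", "ba"]]
counts, frequent, transitions = session_counts(session)
print(counts, frequent, transitions)
```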
Ultimately, we were interested in characterizing the evolution of transitional variability over time. Clearly, any measure of this development is confounded to some degree by the growth over time both in the number of syllables used by a child and in the length of its utterances. To control for these factors, we used a bootstrapped normalization procedure for the statistics reported below. To establish a baseline value for each measure that reflected a random placement of syllables but held vocabulary size and utterance length constant, we shuffled syllables randomly within each recording session while maintaining the length of each utterance. Thus, the order of the syllables was randomized while the frequency of each syllable type remained constant within each session, as did the length of each utterance. The statistics reported below were then recalculated over these bootstrapped randomizations to establish a baseline value. By comparing the observed data to these baselines, we were able to isolate effects due to the structure imposed by the child on the babbling, controlling for changes over time in the number of syllables used by the child and the length of its utterances.
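The shuffling step can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the number of shuffles, the fixed seed, and the use of the mean over shuffles as the baseline are all choices made for the example, since the text does not specify them.

```python
import random
from typing import Callable, List

Utterance = List[str]


def shuffle_session(utterances: List[Utterance], rng: random.Random) -> List[Utterance]:
    """Randomly reassign the session's syllables to utterance slots, keeping
    each utterance's length and each syllable type's frequency unchanged."""
    pool = [s for u in utterances for s in u]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for u in utterances:
        shuffled.append(pool[start:start + len(u)])
        start += len(u)
    return shuffled


def baseline(
    utterances: List[Utterance],
    statistic: Callable[[List[Utterance]], float],
    n_shuffles: int = 1000,  # assumption: the count is not given in the text
    seed: int = 0,
) -> float:
    """Mean of `statistic` over shuffled copies of the session."""
    rng = random.Random(seed)
    values = [statistic(shuffle_session(utterances, rng)) for _ in range(n_shuffles)]
    return sum(values) / len(values)


session = [["ba", "da", "ja"], ["ba"], ["ja", "ja"]]
print(shuffle_session(session, random.Random(1)))
```

Any per-session statistic that takes a list of utterances can be passed as `statistic`, so the observed value and its shuffled baseline are computed by the same function.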
The measures described below were calculated for each syllable type in each session. Sessions were then aligned on a syllable type’s first appearance, and a mean over syllable types was calculated for each session. Only syllable types that first appeared in the course of the experimental period (namely, that were not present at the first session) were included in the analysis.
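The alignment step can be illustrated with a short Python sketch. The data layout is hypothetical: `values` maps each syllable type to a dictionary from session index (0 for the first session) to the measure's value in that session, with entries only for sessions in which the syllable occurred.

```python
from collections import defaultdict
from typing import Dict


def align_on_first_appearance(values: Dict[str, Dict[int, float]]) -> Dict[int, float]:
    """Re-index each syllable type's sessions relative to its first appearance
    and average the measure over syllable types at each aligned position."""
    by_offset = defaultdict(list)
    for syllable, per_session in values.items():
        if not per_session:
            continue
        first = min(per_session)
        if first == 0:
            # Already present at the first session, so excluded from the analysis.
            continue
        for session, value in per_session.items():
            by_offset[session - first].append(value)
    return {offset: sum(v) / len(v) for offset, v in sorted(by_offset.items())}


values = {
    "ba": {0: 1.0, 1: 2.0},  # excluded: present at the first session
    "da": {1: 0.5, 2: 1.5},
    "ja": {2: 1.5, 3: 0.5},
}
print(align_on_first_appearance(values))
```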
Performance of new syllables at edges: we calculated the proportion of occurrences of a syllable type at the edge of an utterance compared to those that were in the middle of an utterance. For the purposes of this measure, we did not count reduplications (i.e., repetitions of a syllable) as being in the middle of the utterance. These frequencies were then compared to a randomized (bootstrapped) baseline.
Addition of new transitions: we calculated the number of new transitions per syllable used in each session, and compared it to baseline levels. This measure indicates how likely each syllable occurrence was to participate in a previously unseen transition.
Reduplication: we calculated the proportion of occurrences of each syllable type in reduplicated transitions (e.g., AA) versus occurrences in variegated transitions (e.g., AB), and compared it to baseline levels.
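The three measures above leave some counting details open, so the Python sketch below shows one possible reading of each rather than the authors' exact definitions. All function names are hypothetical. The assumptions are: an interior occurrence adjacent to a copy of itself is skipped rather than counted as "middle"; new transitions are distinct syllable pairs involving the syllable that were not seen in any earlier session, divided by the syllable's occurrences in the session; and reduplication is counted per transition rather than per occurrence.

```python
from typing import List, Optional

Utterance = List[str]


def edge_proportion(utterances: List[Utterance], syllable: str) -> Optional[float]:
    """Share of counted occurrences that fall at an utterance edge."""
    edge = middle = 0
    for u in utterances:
        last = len(u) - 1
        for i, s in enumerate(u):
            if s != syllable:
                continue
            if i == 0 or i == last:
                edge += 1
            elif u[i - 1] != s and u[i + 1] != s:
                middle += 1  # interior and not part of a reduplication
    total = edge + middle
    return edge / total if total else None


def new_transition_rates(sessions: List[List[Utterance]], syllable: str) -> List[Optional[float]]:
    """Per session: previously unseen transitions involving `syllable`,
    divided by the number of occurrences of `syllable` in that session."""
    seen, rates = set(), []
    for utterances in sessions:
        occurrences = sum(u.count(syllable) for u in utterances)
        current = {
            (a, b) for u in utterances for a, b in zip(u, u[1:]) if syllable in (a, b)
        }
        rates.append(len(current - seen) / occurrences if occurrences else None)
        seen |= current
    return rates


def reduplication_proportion(utterances: List[Utterance], syllable: str) -> Optional[float]:
    """Share of transitions involving `syllable` that are reduplicated (AA)."""
    redup = varieg = 0
    for u in utterances:
        for a, b in zip(u, u[1:]):
            if syllable not in (a, b):
                continue
            if a == b:
                redup += 1
            else:
                varieg += 1
    total = redup + varieg
    return redup / total if total else None


session = [["ba", "da", "ja"], ["da", "ba", "ja"], ["ja", "ba", "ba", "da"]]
print(edge_proportion(session, "ba"), reduplication_proportion(session, "ba"))
```

Each function returns None when the syllable contributes no countable events, so such sessions can be left out of a mean instead of being treated as zero.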
Nine children from the Davis corpus (Davis & MacNeilage, 1995) of the CHILDES database (MacWhinney, 2000): Ben, Cameron, Charlotte, Georgia, Paxton, Rachel, Rebecca, Rowan, and Sam.
The analysis was performed using Matlab code that is available for download at http://ofer.sci.ccny.cuny.edu/publications (see procedures).
Matlab plots containing the analyses shown in Figure 4 of the paper (Stepwise acquisition of vocal combinatorial capacity in songbirds and human infants).
The study was supported by a US Public Health Service grant to Ofer Tchernichovski.
Stepwise acquisition of vocal combinatorial capacity in songbirds and human infants. Dina Lipkind, Gary F. Marcus, Douglas K. Bemis, Kazutoshi Sasahara, Nori Jacoby, Miki Takahasi, Kenta Suzuki, Olga Feher, Primoz Ravbar, Kazuo Okanoya, and Ofer Tchernichovski. Nature 498 (7452), 104-108. doi:10.1038/nature12173
Douglas K. Bemis, INSERM-CEA Cognitive Neuroimaging unit, CEA/SAC/DSV/DRM/Neurospin center, Bât 145, Point Courier 156, F-91191 Gif-sur-Yvette Cedex FRANCE
Gary F. Marcus, Psychology Department, New York University
Dina Lipkind & Ofer Tchernichovski, Ofer Tchernichovski Lab, Hunter College, CUNY
Correspondence to: Douglas K. Bemis ([email protected])
Source: Protocol Exchange (2013) doi:10.1038/protex.2013.057. Originally published online 10 June 2013.