

Authors: Douglas K. Bemis, Gary F. Marcus, Dina Lipkind & Ofer Tchernichovski


As part of a comparative study of early vocal learning in songbirds and human infants, we have analyzed the development of phonetic syntax in the babbling utterances of infants, using data from the CHILDES database. Here we present a detailed description of the data and analysis methods used.



Data for the analysis of human babbling was obtained from 9 children in the Davis corpus (Davis & MacNeilage, 1995) of the CHILDES database (MacWhinney, 2000): Ben, Cameron, Charlotte, Georgia, Paxton, Rachel, Rebecca, Rowan and Sam. On average, these children were 9 months, 28.3 days old (2 months, 1.3 days std.) at the first session, and data was collected for an average of 1 year, 7 months (7 months, 12.8 days std.) after that. The data for each child consisted of 38.8 sessions on average (10.2 std.), recorded an average of 16.07 days apart (6.4 days std.).

Data Analysis

Our analysis focused on characterizing the structure of infant babbling – specifically, the manner in which children combine syllables during prelinguistic vocal exploration. Thus, we restricted our analysis to utterances for which no lexical items were assigned in the CHILDES transcriptions. We parsed these babbled utterances into syllables from the transcribed phonemes using a semi-automated method, as described below. We then performed our analysis over those utterances that received a complete syllabic parse. On average, this amounted to 2135 utterances per child (924 std.) and 62.0 utterances per session (37.5 std.).

Parsing Algorithm

The CHILDES data consists of phonetically transcribed utterances. To convert this data into syllables, we used an iterative process to parse each babbling utterance into a sequence of accepted syllables. An utterance was considered parsed (and so usable in the analysis) only if every phoneme in the utterance was successfully assigned to a syllable by this algorithm. In other words, we considered an utterance for our analysis only if we could fit a sequence of syllables to the phonemes such that each phoneme was used exactly once.

On each iteration of the parsing algorithm, we first manually assigned complete syllabic parses to several unparsed utterances. We then added these new syllables to the set of possible syllables that could be used to parse the utterances. Next, we automatically checked all of the utterances to see whether they could be exhaustively parsed using the current store of syllables. For example, an utterance badaja would be manually assigned the syllabic parse ba, da, ja. On the following automatic parsing pass, an utterance baja could be automatically parsed into the syllables ba and ja. Utterances that could not be fully parsed using the set of defined syllables were then manually parsed, adding to the set of acceptable syllables. Thus, every syllable used to parse the data was manually verified as a valid syllable within the data.

If an utterance could be assigned more than one parse, we chose the parse with the greatest number of two-phoneme syllables (CV or VC). If multiple parses for an utterance were equal on this measure, we either manually assigned a parse to the utterance or left it as ambiguous. All such ambiguous parses were excluded from the analysis. This process was repeated until a sizeable proportion of the data had been parsed from phonemes into syllables (58.2% of babbling utterances [19.0% std.]).
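
The automatic pass of this procedure can be illustrated with a short Python sketch. It is not the authors' Matlab code: the function names are hypothetical, each character is assumed to stand for one phoneme, and any two-character syllable is assumed to count as a two-phoneme (CV or VC) syllable. Manual parsing is represented only by the returned status.

```python
from typing import List, Optional, Set, Tuple


def all_parses(utterance: str, inventory: Set[str]) -> List[List[str]]:
    """Return every split of `utterance` into syllables from `inventory`,
    using each phoneme (character) exactly once."""
    if not utterance:
        return [[]]
    parses = []
    for end in range(1, len(utterance) + 1):
        head = utterance[:end]
        if head in inventory:
            for rest in all_parses(utterance[end:], inventory):
                parses.append([head] + rest)
    return parses


def two_phoneme_count(parse: List[str]) -> int:
    # Assumption: a syllable of length 2 is a CV or VC syllable.
    return sum(1 for syllable in parse if len(syllable) == 2)


def parse_utterance(utterance: str,
                    inventory: Set[str]) -> Tuple[str, Optional[List[str]]]:
    """Return (status, parse). Status is 'parsed', 'unparsed' (no exhaustive
    parse exists yet) or 'ambiguous' (the heuristic cannot pick one parse)."""
    parses = all_parses(utterance, inventory)
    if not parses:
        return "unparsed", None
    best = max(two_phoneme_count(p) for p in parses)
    top = [p for p in parses if two_phoneme_count(p) == best]
    if len(top) > 1:
        return "ambiguous", None
    return "parsed", top[0]
```

With the inventory {ba, da, ja} obtained from the manual parse of badaja, `parse_utterance("baja", {"ba", "da", "ja"})` returns `("parsed", ["ba", "ja"])`. Utterances returned as unparsed or ambiguous would go back to the manual step.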

Initial Tabulation

From this set of parsed babbling utterances, we then tabulated the frequency of each syllable within each session and its placement within its utterance. To parallel the birdsong analysis, we then restricted our analysis below to those syllables that reached a certain frequency threshold during a session, set at 1% of the total number of syllables in the session. Thus, we focused only on those syllables that the child produced at a non-negligible rate within a given recording session. Finally, we also calculated the frequency of all transitions between the syllables. A transition was defined as any two sequential syllables that occurred within a single utterance.
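
As an illustration, the tabulation for a single session might look like the following Python sketch. The function name is hypothetical, and two details are assumptions not stated in the text: the threshold is applied as "at least 1%", and transitions are counted over all syllables rather than only those passing the threshold.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def tabulate_session(utterances: List[List[str]],
                     threshold: float = 0.01
                     ) -> Tuple[Dict[str, int], Set[str], Dict[Tuple[str, str], int]]:
    """Count syllables and transitions in one session.

    `utterances` is a list of parsed utterances, each a list of syllables.
    Returns (syllable counts, syllables at or above the frequency threshold,
    transition counts)."""
    syllable_counts = Counter(s for u in utterances for s in u)
    total = sum(syllable_counts.values())
    frequent = {s for s, n in syllable_counts.items()
                if total > 0 and n / total >= threshold}
    # A transition is any two sequential syllables within a single utterance.
    transitions = Counter((a, b) for u in utterances for a, b in zip(u, u[1:]))
    return dict(syllable_counts), frequent, dict(transitions)
```

Transitions never span utterance boundaries here, because each utterance is paired only with itself.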

Bootstrapped Normalization

Ultimately, we were interested in characterizing the evolution of transitional variability over time. Any measure of this development is confounded to some degree by growth over time in both the number of syllables used by a child and the length of its utterances. To control for these factors, we used a bootstrapped normalization procedure for the statistics reported below. To establish a baseline value for each measure that reflected a random placement of syllables but held vocabulary size and utterance length constant, we shuffled syllables randomly within each recording session while maintaining the length of each utterance. Thus, the order of the syllables was randomized while the frequency of each syllable type remained constant within each session, as did the length of each utterance. The statistics reported below were then recalculated over these bootstrapped randomizations to establish a baseline value. By comparing the observed data to these baselines, we were able to isolate effects due to the structure imposed by the child on the babbling, controlling for changes over time in the number of syllables used by the child and the length of its utterances.
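
The shuffling step can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the number of randomizations and the use of their mean as the baseline are assumptions, since the text does not specify either.

```python
import random
from typing import Callable, List


def shuffle_session(utterances: List[List[str]],
                    rng: random.Random) -> List[List[str]]:
    """Shuffle syllables across one session, keeping each utterance's length
    and the session-wide frequency of every syllable type unchanged."""
    pool = [s for u in utterances for s in u]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for u in utterances:
        shuffled.append(pool[start:start + len(u)])
        start += len(u)
    return shuffled


def baseline(utterances: List[List[str]],
             statistic: Callable[[List[List[str]]], float],
             n_iter: int = 1000,
             seed: int = 0) -> float:
    """Mean of `statistic` over `n_iter` shuffled copies of the session.
    Both n_iter and the use of the mean are assumptions of this sketch."""
    rng = random.Random(seed)
    values = [statistic(shuffle_session(utterances, rng)) for _ in range(n_iter)]
    return sum(values) / len(values)
```

Any session-level statistic that takes a list of utterances can be passed as `statistic`, so the same shuffling routine serves all of the measures.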


The measures described below were calculated for each syllable type in each session. Sessions were then aligned on a syllable type’s first appearance, and a mean over syllable types was calculated for each session. Only syllable types that first appeared in the course of the experimental period (namely, that were not present at the first session) were included in the analysis.
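
One way to picture the alignment step is the Python sketch below. The data layout is hypothetical: each syllable type maps to a list of per-session values in chronological order, with `None` for sessions in which the syllable type was absent. Skipping such sessions when averaging is an assumption of this sketch.

```python
from typing import Dict, List, Optional


def align_on_first_appearance(
        measure: Dict[str, List[Optional[float]]]) -> Dict[int, float]:
    """Return {sessions since first appearance: mean over syllable types}.

    Syllable types already present at the first session (index 0) are
    excluded, as are types that never appear."""
    aligned: Dict[int, List[float]] = {}
    for values in measure.values():
        first = next((i for i, v in enumerate(values) if v is not None), None)
        if first is None or first == 0:
            continue
        for offset, value in enumerate(values[first:]):
            if value is not None:
                aligned.setdefault(offset, []).append(value)
    return {offset: sum(vals) / len(vals)
            for offset, vals in sorted(aligned.items())}
```

Offset 0 in the result corresponds to the session in which each syllable type first appeared.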

Performance of new syllables at edges: we calculated the proportion of occurrences of a syllable type at the edge of an utterance compared to those that were in the middle of an utterance. For the purposes of this measure, we did not count reduplications (i.e. repetitions of a syllable) as being in the middle of the utterance. These frequencies were then compared to a randomized (bootstrapped) baseline.

Addition of new transitions: we calculated the number of new transitions per syllable used in each session, and compared it to baseline levels. This measure indicates how likely each syllable occurrence was to participate in a previously unseen transition.
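
A minimal Python sketch of this measure is given here. The function name is hypothetical, and the normalization is an assumption: for each syllable type, the number of previously unseen transition types involving it in a session is divided by its number of occurrences in that session.

```python
from collections import Counter
from typing import Dict, List


def new_transitions_per_syllable(
        sessions: List[List[List[str]]]) -> List[Dict[str, float]]:
    """For each session (in chronological order), return a mapping from
    syllable type to new transitions per occurrence of that syllable."""
    seen = set()
    results = []
    for utterances in sessions:
        occurrences = Counter(s for u in utterances for s in u)
        session_transitions = {(a, b) for u in utterances
                               for a, b in zip(u, u[1:])}
        new = session_transitions - seen
        new_per_type = Counter()
        for a, b in new:
            for s in {a, b}:  # a self-transition counts once for its syllable
                new_per_type[s] += 1
        results.append({s: new_per_type[s] / n for s, n in occurrences.items()})
        seen |= session_transitions
    return results
```

The same function can be run on shuffled sessions to obtain the baseline for comparison.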

Reduplication: we calculated the proportion of occurrences of each syllable type in reduplicated transitions (e.g., AA) versus occurrences in variegated transitions (e.g., AB), and compared it to baseline levels.
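
The reduplication measure for one session might be computed along the lines of this Python sketch. The function name and the counting convention are assumptions: a transition AA counts once toward A's reduplicated total, and a transition AB counts once toward the variegated totals of both A and B.

```python
from collections import Counter
from typing import Dict, List


def reduplication_proportion(utterances: List[List[str]]) -> Dict[str, float]:
    """Return, for each syllable type that takes part in any transition,
    reduplicated / (reduplicated + variegated) transition counts."""
    reduplicated = Counter()
    variegated = Counter()
    for u in utterances:
        for a, b in zip(u, u[1:]):
            if a == b:
                reduplicated[a] += 1
            else:
                variegated[a] += 1
                variegated[b] += 1
    syllables = set(reduplicated) | set(variegated)
    return {s: reduplicated[s] / (reduplicated[s] + variegated[s])
            for s in syllables}
```

Syllable types that occur only in single-syllable utterances take part in no transition and are therefore absent from the result.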


Nine children from the Davis corpus (Davis & MacNeilage, 1995) of the CHILDES database (MacWhinney, 2000):

Ben, Cameron, Charlotte, Georgia, Paxton, Rachel, Rebecca, Rowan, Sam.


The analysis was performed using Matlab code that is available for download at (see procedures).


  1. Download matlab codes for analysis from (click on the link saying ‘+ shared protocol for babbling analysis’). The codes are in a zipped folder.
  2. Download infant vocal transcriptions data from
    • If you would like to repeat the babbling analysis used in the paper, download the data for the 9 infants listed in the ‘Reagents’ section from
    • Put all data in a folder within the folder containing the matlab codes (point 1). Name the folder ‘English-Davis_CHAT’ or update the matlab code accordingly (see below).
  3. Open matlab and then open the ‘Analysis.m’ script. This script contains two cells: the first parses the .cha files from the database, and the second produces the graphs presented in the paper.


  1. Make sure that the matlab codes are placed in a directory for which a path is set in matlab (see help for ‘set path’ in matlab).
  2. If your infant data directory is named something other than ‘English-Davis_CHAT’, replace that folder name in the ‘Analysis.m’ file with the correct one.

Anticipated Results

Matlab plots containing the analyses shown in Figure 4 of the paper (Stepwise acquisition of vocal combinatorial capacity in songbirds and human infants).


  1. Davis, B. L., & MacNeilage, P. F. (1995). The articulatory basis of babbling. Journal of Speech & Hearing Research, 38, 1199-1211.
  2. MacNeilage, P. F. (2008). The origin of speech: Oxford University Press, USA.
  3. MacNeilage, P. F., & Davis, B. L. (2000). On the Origin of Internal Structure of Word Forms. Science, 288, 527-531.
  4. MacNeilage, P. F., & Davis, B. L. (2011). In Defense of the “Frames, then content” (FC) Perspective on Speech Acquisition: A Response to Two Critiques. Language Learning and Development, 7, 234-242.
  5. MacNeilage, P. F., Davis, B. L., & Matyear, C. L. (1997). Babbling and first words: Phonetic similarities and differences. Speech Communication, 22, 269-277.
  6. MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (Vol. Third Ed.): Psychology Press.
  7. Oller, D. K. (1980). The emergence of the sounds of speech in infancy. In G. Yeni-Komshian, J. F. Kavanagh & G. A. Ferguson (Eds.), Child phonology 1. Production (pp. 93-112). New York, NY: Academic Press.
  8. Smith, B. L., Brown-Sweeney, S., & Stoel-Gammon, C. (1989). A quantitative analysis of reduplicated and variegated babbling. First Language, 9, 175-189.
  9. Stark, R. E. (1980). Stages of speech development in the first year of life. In G. Yeni-Komshian, J. F. Kavanagh & G. A. Ferguson (Eds.), Child phonology 1. Production (pp. 73-90). New York, NY: Academic Press.


The study was supported by a US Public Health Service grant to Ofer Tchernichovski.

Associated Publications

Stepwise acquisition of vocal combinatorial capacity in songbirds and human infants. Dina Lipkind, Gary F. Marcus, Douglas K. Bemis, Kazutoshi Sasahara, Nori Jacoby, Miki Takahasi, Kenta Suzuki, Olga Feher, Primoz Ravbar, Kazuo Okanoya, and Ofer Tchernichovski. Nature 498 (7452) 104 - 108 doi:10.1038/nature12173

Author information

Douglas K. Bemis, INSERM-CEA Cognitive Neuroimaging unit, CEA/SAC/DSV/DRM/Neurospin center, Bât 145, Point Courier 156, F-91191 Gif-sur-Yvette Cedex FRANCE

Gary F. Marcus, Psychology Department, New York University

Dina Lipkind & Ofer Tchernichovski, Ofer Tchernichovski Lab, Hunter College, CUNY

Correspondence to: Douglas K. Bemis ([email protected])

Source: Protocol Exchange (2013) doi:10.1038/protex.2013.057. Originally published online 10 June 2013.
