Background

Segmenting continuous streams

Speech and action sequences are both continuous information streams that must be successfully segmented into constituent sub-units in order to be understood.


[Figure: speech segmentation example. Unsuccessful speech segmentation.]
[Figure: action segmentation example. Successful action segmentation.]

In both the speech and action domains, this segmentation is achieved through a combination of top-down and bottom-up processing.

Top-down and bottom-up processing

Top-down processing involves the application of pre-existing knowledge to determine where boundaries between phrases occur.

Bottom-up processing involves processing of properties of the stimulus to determine boundary location.

Work with adults has highlighted top-down and bottom-up cues that support segmentation of speech and action. For example:

Speech

Listeners apply their knowledge of word meanings and grammar to determine the locations of boundaries in speech (e.g. Mattys et al., 2007).

Prosodic cues (e.g. pause and pre-boundary lengthening) are produced at phrase boundaries (e.g. Wagner & Watson, 2010), and listeners detect these cues to determine the location of phrase boundaries in speech (e.g. Schafer et al., 2000).

tying laces
Action Sequence

Observers track the goals and intentions of an actor, and map boundaries to moments of goal-achievement (e.g. Levine et al., 2017).

The movement itself contains kinematic cues to boundaries between actions. These kinematic cues include pause and pre-boundary lengthening (Hilton et al., 2019), and observers make use of these kinematic cues to determine the location of boundaries in the action sequence (Hemeren & Thill, 2010).

Developmental Perspective

  • Infants’ access to top-down processes is restricted, because they do not yet possess the relevant knowledge/experience.
  • In speech, it has been proposed that infants therefore initially capitalise on bottom-up cues (prosody) to segment the stream (Prosodic Bootstrapping Account; Gleitman & Wanner, 1982).

  • However, little is known about infants’ processing of bottom-up cues during action segmentation.
  • In parallel to early speech processing, infants may capitalise on kinematic boundary cues to initially segment actions, especially when the actions are unfamiliar or not goal-directed.

Research Questions and Aims

  1. Are infants sensitive to kinematic boundary cues?
    • We measured the electrophysiological response to kinematic boundary cues.
    • We examined whether this response is similar to that evoked by prosodic boundary cues.
  2. Do the kinematic boundary cues modulate processing of the subsequent action?
    • Finding that infants are sensitive to kinematic boundary cues would not automatically mean that these cues play a role in action processing.
    • We therefore examined whether the kinematic boundary cues modulate action processing, by comparing the electrophysiological response to actions that do or do not follow kinematic boundary cues.

Procedure


Stimuli

  • Three child-friendly characters were created:

character 1

character 2

character 3

  • These characters were then animated to perform sequences of three actions.

  • Two action sequences were defined:

  1. Turn then stretch then jump
  2. Jump then stretch then turn
  • On no-boundary trials each sequence was shown as a single continuous sequence.

  • On boundary trials, a boundary was signalled between the second and final action.

  • On boundary trials, the boundary was signalled by two kinematic boundary cues:

  1. Pre-boundary lengthening: The second action “stretch” involved the character expanding outwards in the horizontal plane, before shrinking back to its original size. To achieve pre-boundary lengthening, the speed with which the character shrank back to its original size was slowed, extending the overall duration of the action by 240 ms.
  2. Pause: Following the completion of the pre-boundary action, the character paused motionless for 350 ms.
  • These timings were based on typical durations of pre-boundary lengthening and pause as found in naturally-produced speech, and durations of the actions forming the sequences were:
Element        No-boundary trial        Boundary trial
               Duration (ms)            Duration (ms)
still frame    1000                     1000
action 1       600                      600
action 2       600                      840
pause          0                        350
action 3       750                      750
  • Each character performed both sequences with and without a boundary, resulting in a final stimulus set of 12 videos (3 characters x 2 sequences x 2 trial types).

  • Below, you can see an example of a no-boundary trial and a boundary trial.
    • Note: On web-browsers, these videos do not play at their full time-resolution, meaning that they may appear somewhat jumpy. When presented to participants in the lab, the videos were correctly rendered and smooth.
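
The timing table and the stimulus count above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original materials: the durations are copied from the table, while the names `DURATIONS_MS`, `total_duration_ms`, and the sequence labels are hypothetical and chosen only for illustration.

```python
from itertools import product

# Element durations in ms, copied from the timing table.
DURATIONS_MS = {
    "no-boundary": {"still frame": 1000, "action 1": 600, "action 2": 600,
                    "pause": 0, "action 3": 750},
    "boundary": {"still frame": 1000, "action 1": 600, "action 2": 840,
                 "pause": 350, "action 3": 750},
}


def total_duration_ms(trial_type):
    """Sum the element durations of one trial type."""
    return sum(DURATIONS_MS[trial_type].values())


# Hypothetical labels for the three characters and two action sequences.
CHARACTERS = ["character 1", "character 2", "character 3"]
SEQUENCES = ["turn-stretch-jump", "jump-stretch-turn"]
TRIAL_TYPES = list(DURATIONS_MS)

# Every character performs every sequence in every trial type.
videos = list(product(CHARACTERS, SEQUENCES, TRIAL_TYPES))

# Extra time on boundary trials: lengthening plus pause.
extra_ms = total_duration_ms("boundary") - total_duration_ms("no-boundary")

print(len(videos), total_duration_ms("no-boundary"),
      total_duration_ms("boundary"), extra_ms)
```

Under these figures a boundary trial is longer than a no-boundary trial by exactly the lengthening plus the pause (240 ms + 350 ms).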


No-boundary trial

Boundary trial


Participants

12-month-old infants (N = 27) from German-speaking households were tested.

Sample characteristics

N     M age          SD     % girls
27    11.7 months    0.7    48%

Summary statistics: Number of artefact-free trials contributed by participants

Condition      Mean no. of artefact-free trials    Range
no-boundary    23.2                                15 - 34
boundary       25.0                                14 - 37

EEG testing

  • Infants were shown the 12 stimulus videos in a randomized order while we recorded EEG. Stimuli were presented until the infants became bored and looked consistently away from the screen.

  • EEG was recorded from 30 electrodes.

  • 9 of these electrodes served as critical electrodes for analysis (F3, Fz, F4, C3, Cz, C4, P3, Pz, P4).

Question 1

Were infants sensitive to the kinematic boundary cues?

Closure Positive Shift (CPS):


An ERP component initially discovered in response to prosodic boundary cues in speech (Steinhauer et al., 1999). This component is a slow, broadly distributed positivity in the ERP that begins around the onset of the boundary and lasts approximately 500 ms (Boegels et al., 2011).


  1. The CPS has been found in response to prosodic boundary cues in speech already during the first year of life (Holzgrefe et al., 2018).
  2. In adults, a CPS-like positivity has been found in response to kinematic boundary cues in action sequences (Hilton et al., 2019).


Slow-motion video - ERP for whole sequence


ERP during CPS window - Regions

Individual Electrodes


Analysis & Conclusion

For every trial, segments from the mid-point of the second action until the mid-point of the third action were exported for analysis. The maximum amplitude in each segment was calculated, resulting in an analysis of mean maximum amplitude during this time interval.
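
The "mean maximum amplitude" measure described above can be sketched as follows. This is an illustration only, not the authors' analysis code: the function names, the toy sampling rate, the window limits, and the toy waveforms are all assumptions made for the example.

```python
from statistics import mean


def window_max(samples_uv, sampling_rate_hz, start_ms, end_ms):
    """Return the largest amplitude between start_ms and end_ms (inclusive)."""
    first = round(start_ms * sampling_rate_hz / 1000)
    last = round(end_ms * sampling_rate_hz / 1000)
    return max(samples_uv[first:last + 1])


def mean_max_amplitude(trials, sampling_rate_hz, start_ms, end_ms):
    """Average the per-trial window maxima across trials."""
    return mean(window_max(t, sampling_rate_hz, start_ms, end_ms)
                for t in trials)


# Toy data (hypothetical): two trials sampled at 100 Hz, i.e. 10 ms per sample.
toy_trials = [
    [0.0, 1.0, 3.0, 2.0, 0.5, -1.0],
    [0.0, 0.5, 1.0, 5.0, 4.0, 9.0],
]

# Window from 10 ms to 40 ms covers sample indices 1 to 4.
result = mean_max_amplitude(toy_trials, 100, 10, 40)
print(result)
```

In the actual analysis the window would run from the mid-point of the second action to the mid-point of the third action, and the per-trial maxima would be averaged per participant, condition, and region before entering the ANOVA.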

A 2 (condition: no-boundary vs. boundary) x 3 (region: frontal vs. central vs. posterior) repeated-measures ANOVA was performed on the mean maximum amplitude data:

effect              F        df       p        \(\eta_{G}^{2}\)
condition           73.11    1, 26    <.001    0.342
region              19.08    2, 52    <.001    0.065
condition*region    2.70     2, 52    0.077    0.004
  • These results indicate a positive shift in the boundary condition relative to the no-boundary condition.
  • This difference in the ERPs indicates that the infants detected the kinematic boundary cues.
  • The ERP response to kinematic boundary cues is CPS-like, suggesting that the cognitive processes underlying the processing of kinematic cues could be similar to those involved in the processing of prosodic boundary cues.
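
For readers unfamiliar with the effect size reported in the table, generalized eta squared for a design in which both factors are manipulated within subjects is commonly defined as the effect's sum of squares divided by itself plus all subject-related sums of squares. The expression below is a sketch of that common definition, written under the assumption that it matches the software used for the analysis; \(s\) denotes subjects:

\[
\eta_{G}^{2} = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{s} + SS_{s \times \text{condition}} + SS_{s \times \text{region}} + SS_{s \times \text{condition} \times \text{region}}}
\]

Because between-subject variability \(SS_{s}\) stays in the denominator, \(\eta_{G}^{2}\) is typically smaller than partial eta squared for the same effect.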

Question 2

Do kinematic boundary cues modulate processing of subsequent actions?

Negative central (Nc) component:


A negative peak in the ERP over fronto-central electrodes emerging between 300 and 900 ms following stimulus onset, implicated in attentional processing (e.g., Nelson & Collins, 1991; Reynolds & Richards, 2005). It has recently been taken as a measure of action processing during infancy, reflecting attention to and encoding of individual actions (Monroy et al., 2019).


ERP


Analysis & Conclusion

For every trial, the Nc was analysed by exporting the minimum amplitude from the ERP in the 250 ms - 750 ms time interval following the onset of each action. The mean minimum amplitude was then averaged across six fronto-central electrodes (F3, Fz, F4, C3, Cz, C4), and analysed with a 3 (action: first, second, final) x 2 (condition: boundary, no-boundary) repeated measures ANOVA.

effect              F       df       p        \(\eta_{G}^{2}\)
condition           0.65    1, 26    0.426    0.003
action              4.16    2, 52    0.021    0.021
condition*action    7.73    2, 52    0.001    0.026
  • These results suggest that each action evoked an Nc component, except the final action in the no-boundary condition.
  • The kinematic boundary cues thus modulated processing of the subsequent action.
  • The final action in the no-boundary condition could have been encoded as a continuation of the second action, hence no Nc response.
  • Alternatively, the final action in the no-boundary condition may have overloaded infants’ processing capacity. Kinematic boundary cues in the boundary condition could, however, have prompted the chunking and storage of previous actions, freeing up capacity for the final action.
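
To make the Nc analysis windows concrete, the sketch below derives the 250 ms - 750 ms window following each action onset from the stimulus timing table. It assumes that time is measured from the start of the video (including the 1000 ms still frame), which the poster does not state; the names `nc_windows`, `ELEMENTS`, and `DURATIONS_MS` are hypothetical.

```python
from itertools import accumulate

# Element order and durations in ms, copied from the stimulus timing table.
ELEMENTS = ["still frame", "action 1", "action 2", "pause", "action 3"]
DURATIONS_MS = {
    "no-boundary": [1000, 600, 600, 0, 750],
    "boundary": [1000, 600, 840, 350, 750],
}
NC_WINDOW_MS = (250, 750)  # window limits relative to each action onset


def nc_windows(trial_type):
    """Map each action to its (start, end) Nc window in ms from video onset."""
    durations = DURATIONS_MS[trial_type]
    # An element starts when all preceding elements have finished.
    onsets = [0] + list(accumulate(durations))[:-1]
    return {
        name: (onset + NC_WINDOW_MS[0], onset + NC_WINDOW_MS[1])
        for name, onset in zip(ELEMENTS, onsets)
        if name.startswith("action")
    }


for trial_type in DURATIONS_MS:
    print(trial_type, nc_windows(trial_type))
```

Under this assumption the windows for the first two actions are identical across conditions, and only the window for the final action shifts later on boundary trials.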

tl;dr

We are interested in how infants process boundaries between individual actions of an action sequence.
Work with adults suggests that kinematic cues (properties of the movement) can signal the location of boundaries in action sequences.
We presented 12-month-old infants with cartoon action sequences while recording EEG.
Half of the sequences contained kinematic boundary cues (pre-boundary lengthening and pause).
We found evidence of an ERP component indicating that the infants detected and processed the kinematic boundary cues.
The kinematic cues also modulated infants’ processing of subsequent actions.
We contend that these low-level kinematic cues play a role in early action segmentation and processing.

Poster

Contact

Name and Address

Matt Hilton

Karl-Liebknecht-Str. 24-25

14476 Potsdam

Contact Info

Phone: 0331 977 2485

E-Mail:

Website: www.matthilton.de