Comparison of mastery criteria applied to individual targets and stimulus sets on acquisition of tacts, intraverbals, and listener responses

Maria Clara Cordeiro , Tiffany Kodak, Jessi Reidy, Abigail Stoppleworth, Karly Zelinski and Andrea Jainga
Marquette University
Abstract
Mastery criteria can be applied to individual targets or stimuli organized into sets. Wong et al. (2021) and Wong and Fienup (2022) found that participants who received special education services learned sight words more rapidly when an individual target mastery criterion was applied. The current study replicated and extended these findings across novel skills. Five participants with ASD received tact or intraverbal training in Experiment 1, and 2 participants with ASD received auditory–visual conditional discrimination training (AVCD) in Experiment 2. In both experiments, mastery criteria were applied to targets and stimulus sets to compare sessions to mastery. Results showed the target mastery criterion required fewer sessions of tact training for 3 of 5 participants and AVCD training for both participants. However, overselection of stimuli occurred for 20% of AVCD mastered targets, suggesting a false positive for acquisition of those targets. Maintenance was similar across conditions and experiments.
Key words: autism spectrum disorder, false positives, listener training, mastery criteria, tact training
We thank Lauren Debertin, Kirsten Lloyd, Marisa McKee, Courtney Meyerhofer, Diana Meredith, Alyssa Scott, and Xi’an Williams for their assistance with data collection.
Address correspondence to: Dr. Tiffany Kodak, 525 N 6th St., Marquette University, Milwaukee, WI 53203.
Email: tiffany.kodak@marquette.edu
doi: 10.1002/jaba.946
© 2022 Society for the Experimental Analysis of Behavior (SEAB).
Discrete-trial instruction (DTI) involves the arrangement of antecedents (e.g., motivating operations, discriminative stimuli) to occasion target behavior and a preferred consequence to increase the future likelihood of behavior. Although DTI programs include these components, their arrangements vary. One component that can vary is the arrangement of stimuli into teaching sets and the mastery criteria applied to those stimuli. For example, a set of three targets can be taught simultaneously (Maurice et al., 1996), and the learner completes training of one stimulus set and begins instruction for a second stimulus set once all three targets in that first set meet a mastery criterion (e.g., Kodak et al., 2020). Alternatively, stimuli may be taught individually or as a fluctuating set (Lovaas, 1981), and the mastery criterion may be applied to each target. Once an individual target is mastered, the mastered target is replaced by a new target and training with the newly constituted set continues (Knutson et al., 2019).
The selection of mastery criteria can influence the efficacy of instruction and maintenance of skills. Research suggests that more stringent mastery criteria (e.g., 90%-100%) result in acquisition of skills that maintain for longer durations, in comparison to less stringent mastery criteria (e.g., 80%; Fuller & Fienup, 2018; Richling et al., 2019). Further, the application of mastery criteria to sets of stimuli versus individual targets can influence the efficiency of instruction (Wong et al., 2021; Wong & Fienup, 2022). For example, teaching stimuli in sets may result in extended exposure to instruction if acquisition of one or more targets in the set is delayed while the learner consistently responds correctly to all other targets in the set (e.g., Kodak et al., 2020). In this case, use of mastery criteria applied to targets may prevent the loss of instructional time because any mastered targets can be replaced with novel targets.
Researchers have recently begun to investigate the effects of mastery criteria applied to stimulus sets versus individual targets (e.g., Wong et al., 2021; Wong & Fienup, 2022). Wong et al. (2021) compared the effects of mastery criteria applied to stimulus sets (set analysis; SA) and individual targets (operant analysis; OA) on skill acquisition and maintenance. Four learners who received special education services participated, and textual responses to sight words were targeted for instruction. In the SA condition, each set of four sight words was mastered once correct responding reached 100% for one session. In the OA condition, an individual target within the set of three sight words was mastered once correct responding to that target reached 100% for one session. Once all sight words were mastered in the OA condition, the experimenters applied the OA mastery criterion to the remaining SA targets. All four participants mastered more sight words in the OA condition. Two of the four participants maintained a higher percentage of responding to targets taught in the SA condition, and two participants showed similar maintenance of responses to targets taught in both conditions.
Although novel outcomes were obtained by Wong et al. (2021), the experimenters applied a decision-making protocol to targets that were not quickly mastered, and this modification was applied disproportionately to the OA condition. In addition, the authors discontinued use of the SA mastery criterion once all sight words were taught in the OA condition, thereby preventing an analysis of differences in the efficiency of instruction of targets exposed to varying mastery criteria. Wong and Fienup (2022) addressed these limitations by conducting a replication that maintained the same instructional procedures across SA and OA conditions. The authors also modified the mastery criterion from 100% correct responses for one session to two consecutive sessions to address potential confounds in evaluating maintenance. Three participants who received special education services were taught sight words arranged in SA and OA conditions. All participants acquired textual responses more rapidly in the OA condition, replicating findings in Wong et al. However, participants did not show differences in maintenance across conditions, as observed by Wong et al. Instead, maintenance of responses to targets was similar across conditions.
The findings of Wong et al. (2021) and Wong and Fienup (2022) provide preliminary support for the application of mastery criteria to individual targets. Nevertheless, these studies compared set and target mastery criteria applied to teaching of sight words only. Fienup and Carr (2021) suggested continued research on mastery criteria applied to different skills. Thus, replications of Wong and Fienup should be conducted with other skills that are likely to be grouped into stimulus sets for instruction (e.g., tacts, auditory–visual conditional discrimination [AVCD]) and for which target and set mastery criteria may be arranged in practice. Also, use of mastery criteria applied to targets could result in false positives for acquisition. For example, during AVCD training in practice, a learner could touch the same comparison stimulus across all trials, and that target would be considered mastered according to an individual target mastery criterion. This would be an example of a false positive for acquisition of that target; acquisition of the target occurs when the participant touches the picture (e.g., a hippo) when and only when the auditory stimulus that corresponds to that picture (i.e., “hippo”) is presented. Touching the picture of the hippo should not occur when other auditory stimuli in the stimulus set (e.g., “alligator” and “flamingo”) are presented. This type of false positive could not occur when a set mastery criterion is applied to training, because the learner must engage in correct responses to all stimuli in the set to meet mastery. Further analysis of response patterns during teaching should be included to identify instances of false positives produced by use of certain acquisition mastery criteria. For example, learners who select the same target in every trial during AVCD training would demonstrate responding that is considered mastered according to the target mastery criterion used by Wong et al. (2021) and Wong and Fienup (2022).
The purpose of the current investigation was to systematically replicate and extend Wong et al. (2021) and Wong and Fienup (2022) by investigating the application of mastery criteria to sets and targets across different skills frequently taught in skill-acquisition programs; tact and intraverbal training occurred in Experiment 1, and AVCD training occurred in Experiment 2. In addition, we analyzed data for potential false positives of acquisition during AVCD training.
Experiment 1
Method
Participants and Setting
Five children with a medical diagnosis of autism spectrum disorder (ASD) participated in Experiment 1. All participants were previously exposed to mastery criteria applied to sets of stimuli in daily programming. Two participants (Omar and Tim) also had prior exposure to mastery criteria applied to individual targets during tact training for a brief period more than 2 years prior to the current investigation. All participants engaged in vocal-verbal behavior as their primary mode of communication.
Omar was a 7-year-old, Middle Eastern boy who had received 4 years of behavior-analytic intervention. His tact and listener repertoires were within Level 3 (30-48 months) on the Verbal Behavior Milestones Assessment and Placement Program (VB-MAPP; Sundberg, 2008). Josh was an 8-year-old, European American boy who communicated with one- to three-word phrases and had received 1 year of behavior-analytic intervention. His tact and listener repertoires were within Level 2 (18-30 months) on the VB-MAPP. Tim was an 8-year-old, European American boy who had received 5 years of behavior-analytic intervention to reduce severe problem behavior and increase functional communication. His tact and listener repertoires were within Level 3 (30-48 months) on the VB-MAPP. He received 3 months of telehealth services at the start of the investigation and required caregiver assistance for the entire duration of appointments. Connor was a 9-year-old, European American boy who spoke in complete sentences, participated in general education for academic instruction at school, and had received 1 year of in-person behavior-analytic intervention. He received 3 months of telehealth services prior to his participation and engaged in intervention independently (i.e., with no caregiver assistance). Billy was a 6-year-old, European American boy who had received 2 years of behavior-analytic intervention. His tact and listener repertoires were within Level 3 (30-48 months) on the VB-MAPP.
All sessions were conducted in a quiet room with minimal distractions; the location of sessions remained the same across sessions for each participant. In-person sessions took place in a university-based clinic in the Midwest for Omar, Josh, and Billy, whereas sessions for Tim and Connor occurred in their home via telehealth. A chair, table, and relevant instructional and data collection materials were present during sessions.
Materials
Laminated stimulus cards (approximately 12.7 cm x 9.4 cm) with images printed in color were utilized with Omar, Josh, and Billy (Table 1). Stimuli were delivered in PowerPoint® via the Zoom® screen share function for Tim and Connor (Table 1). Math problems for Connor were presented in 239-point black Calibri font centered on a white slide. Single- and double-digit numbers were presented in a vertical array (i.e., numerator on the top and denominator on the bottom) with the division symbol adjacent to the denominator.
Preferred tangible and edible items were included during teaching. Preferred items were identified via previous paired-stimulus or brief multiple-stimulus-without-replacement preference assessments (Carr et al., 2000; Fisher et al., 1992) conducted daily or multiple times per day, depending on the participant, and participant mands (e.g., Omar’s mand “I want the marble machine” resulted in access to a marble run during the reinforcer interval). Connor received tally marks on a whiteboard following correct responses. Each tally mark was worth 15 s of video game play, and Connor was permitted to exchange the points immediately or accumulate and exchange them at the end of the session.
Table 1
Participant Target Stimuli Across Experiments
Response Measurement
Data were collected on correct independent responses, errors, and no responses. The primary dependent variable was correct independent responses defined as the participant emitting a vocalization within 5 s that corresponded to the discriminative stimulus (SD). An error was defined as the participant emitting a vocalization other than the targeted response (excluding vocal stereotypy) or multiple responses (e.g., saying, “boat-bus” or “bus-boat” for the target bus). No response was defined as no emission of vocalizations within 5 s of the SD. Correct independent responses were converted to a percentage by dividing the number of trials with a correct independent response by the total number of trials and multiplying by 100.
Data also were collected on overtraining at the conclusion of the condition comparison. Overtraining trials in the set condition were calculated using an identical method to Wong et al. (2021) and Wong and Fienup (2022) for their SA condition. That is, data from the set condition were graphed per target stimulus. Once a target in the set condition reached mastery, we counted the number of additional sessions of training completed for that target in order to meet the set mastery criterion. For example, if target 1 in a set met the target mastery criterion in three sessions, but 10 total sessions of training were required to reach the set mastery criterion, we calculated that an additional seven sessions of unnecessary training occurred for target 1. Because each target was presented three times per session, we multiplied the number of additional training sessions conducted past mastery by three (e.g., target 1 had 7 additional training sessions × 3 trials per stimulus = 21 overtraining trials for target 1). We replicated this method with the other two targets in the set to obtain the total number of overtraining trials per set. Then, we summed the number of overtraining trials per set (i.e., added the number of overtraining trials from targets 1, 2, and 3). Finally, the total trials of overtraining were divided by the total trials conducted until the mastery criterion was met and multiplied by 100 to obtain a percentage of overtraining trials per set (i.e., training trials allocated to overtraining). Also, the average number of overtraining trials in the set condition was calculated by dividing the sum of overtraining trials across all sets (e.g., overtraining trials for set 1 + set 2 + set 3, etc.) by the total number of targets in the condition (i.e., 15). Results of this analysis are reported in Table 2 for Experiments 1 and 2.
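The overtraining calculation described above can be expressed as a short script. This is an illustrative sketch only; the function name and the example session counts (taken from the worked example in the text) are ours, and the three-trials-per-target multiplier follows the session structure the authors describe.

```python
# Sketch of the overtraining calculation (hypothetical helper, not the
# authors' software). Each value in the input list is the number of sessions
# a target needed to meet the target-level mastery criterion; the set is
# mastered only when its slowest target meets criterion.
TRIALS_PER_TARGET = 3  # each target was presented three times per session

def overtraining_for_set(sessions_to_target_mastery):
    """Return total overtraining trials for one set of three targets."""
    sessions_to_set_mastery = max(sessions_to_target_mastery)
    extra_sessions = (sessions_to_set_mastery - s for s in sessions_to_target_mastery)
    return sum(e * TRIALS_PER_TARGET for e in extra_sessions)

# Worked example from the text: target 1 mastered in 3 sessions, the full set
# in 10, so target 1 alone accrues (10 - 3) * 3 = 21 overtraining trials.
set_sessions = [3, 8, 10]
total_overtraining = overtraining_for_set(set_sessions)

# Percentage of the set's training trials allocated to overtraining:
total_trials = max(set_sessions) * TRIALS_PER_TARGET * len(set_sessions)
pct_overtraining = total_overtraining / total_trials * 100
```

With the example above, the set accrues 21 + 6 + 0 = 27 overtraining trials out of 90 total training trials (30%).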
Interobserver Agreement and Treatment Integrity
Interobserver agreement (IOA) was calculated using the trial-by-trial method. Two independent observers collected data on all dependent variables from video recordings or in person. An agreement was defined as both observers scoring the same dependent variable (e.g., incorrect response) during a trial. Interobserver agreement was calculated for a minimum of 34% of sessions for each participant. The number of trials with agreements was divided by the total number of trials per session and converted to a percentage. All sessions with IOA were averaged to calculate the mean agreement for each participant (Table 3).
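The trial-by-trial IOA computation can be sketched as follows. This is a minimal illustration with invented observer records, not the authors' data system; any response labels are hypothetical.

```python
# Minimal sketch of trial-by-trial interobserver agreement (hypothetical data).
# Each element is the response category one observer scored for a trial.
def trial_by_trial_ioa(observer_1, observer_2):
    """Percentage of trials on which both observers scored the same category."""
    agreements = sum(a == b for a, b in zip(observer_1, observer_2))
    return agreements / len(observer_1) * 100

obs_1 = ["correct", "error", "correct", "no response", "correct",
         "correct", "error", "correct", "correct"]
obs_2 = ["correct", "error", "correct", "correct", "correct",
         "correct", "error", "correct", "correct"]
session_ioa = trial_by_trial_ioa(obs_1, obs_2)  # 8 of 9 trials agree
```

Session-level percentages like `session_ioa` would then be averaged across all sessions with IOA to obtain each participant's mean agreement.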
Treatment integrity (TI) data were also collected for two measures. First, TI data for the teaching procedure were collected on a trial-by-trial basis for a minimum of 34% of sessions per participant. A trained observer collected data on the experimenter’s implementation of all components in each trial according to a protocol. The trained observer scored whether the experimenter secured the participant’s attending to the target stimulus, presented the correct instruction, presented prompts at the appropriate prompt delay (i.e., 5 s), delivered the appropriate error correction (when necessary), provided praise and tangible reinforcers at the correct times, and did not add additional treatment components. The trial was scored as correct (1) if the experimenter conducted all components of the trial according to the experimental protocol. Deviations from the protocol resulted in a score of 0 for the trial. The number of trials implemented correctly was divided by the number of trials in a session and converted to a percentage (Table 3).
Treatment integrity data also were collected on the correct implementation of the mastery criterion in each condition for each participant. The mastery criterion was scored as correctly applied to a target or set when the experimenter ended teaching once two consecutive sessions with 100% correct responding occurred for the target or set. If the experimenter continued teaching any of the mastered targets or sets, this was scored as an error (0). The number of targets or sets for which the mastery criterion was correctly applied was divided by the total number of targets or sets in the condition. Mean TI was calculated for each condition by dividing the sum of these proportions across participants by the total number of participants and multiplying by 100. Mean TI for correct implementation of the mastery criterion across participants was 100% for the target condition and 100% for the set condition.
Experimental Design
An adapted alternating treatments design (Cariveau & Fetzner, 2022; Sindelar et al., 1985) was used to compare the effects of an individual target versus set mastery criterion on acquisition. Sessions of each condition were alternated until all assigned stimuli were mastered in one condition. Thereafter, sessions of the remaining condition were conducted in succession until all stimuli in the remaining condition were mastered.
Table 2
Overtraining Trials in the Set Condition

Experiment 1

| Overtraining Trials | Omar | Josh | Tim | Connor | Billy |
|---|---|---|---|---|---|
| Set 1 (%) | 3 (7%) | 18 (29%) | 30 (37%) | 21 (33%) | 6 (17%) |
| Set 2 (%) | 3 (8%) | 40 (56%) | 12 (19%) | 18 (40%) | 6 (17%) |
| Set 3 (%) | 0 | 63 (44%) | 12 (22%) | 12 (22%) | 12 (27%) |
| Set 4 (%) | 9 (20%) | 45 (42%) | 6 (6%) | 30 (48%) | 0 |
| Set 5 (%) | 9 (20%) | 27 (38%) | 21 (29%) | 3 (11%) | 6 (17%) |
| Total (%) | 24 (1%) | 273 (55%) | 81 (30%) | 84 (44%) | 30 (20%) |
| Total Targets Mastered | 15 | 15 | 15 | 15 | 15 |
| Average Overtraining Trials per Target | 1.6 | 18.2 | 5.4 | 5.6 | 2 |

Experiment 2

| Overtraining Trials | Omar | Josh |
|---|---|---|
| Set 1 (%) | 45 (50%) | 30 (56%) |
| Set 2 (%) | 9 (25%) | 3 (8%) |
| Set 3 (%) | 3 (11%) | 24 (38%) |
| Set 4 (%) | 3 (11%) | 6 (83%) |
| Set 5 (%) | 3 (11%) | 24 (20%) |
| Total (%) | 63 (35%) | 54 (27%) |
| Total Targets Mastered | 15 | 15 |
| Average Overtraining Trials per Target | 4.2 | 6 |
Identification of Stimuli and Baseline
A baseline assessment was conducted to select target stimuli for inclusion in each condition and to verify similar levels of incorrect responding to each stimulus prior to teaching (Wong et al., 2021; Wong & Fienup, 2022). Stimuli included in the baseline assessment were selected based on participants’ individualized intervention goals (e.g., adjective–noun tacts). The experimenter presented a stimulus approximately 1 m from the participant’s face and asked, “What is it?” Participants had 5 s to respond, and no differential consequences were provided for responses. A minimum of three presentations of each stimulus were conducted, and stimuli were alternated across trials in sessions. The experimenter presented a mastered task trial after approximately every two trials, and correct independent responses to mastered tasks resulted in 20-s access to a tangible item or a small piece of a preferred edible for all participants except Connor. Connor received a point for each correct independent response that could be exchanged immediately or accumulated and exchanged at the end of the session.
Stimuli to which participants emitted correct independent responses were omitted from the investigation. The experimenters used a logical analysis to equate targets in each set (Cariveau et al., 2021; Wolery et al., 2018) based on visual similarity, overlapping sounds, and number of syllables. Thirty targets were identified per participant, and each condition consisted of 15 targets. Targets were grouped into sets of three in the set condition.
Procedure
One to two sessions of each condition were conducted per day, 3 to 5 days per week. Sessions included nine trials with three targets presented three times each. Targets were presented in a block (e.g., each of the three targets was presented once) before presenting the targets again in a different order. Trials were presented exactly as in baseline, except a vocal model prompt was delivered if 5 s elapsed without a correct independent response. Both correct independent and prompted responses resulted in praise and 20-s access to a tangible item, small piece of a preferred edible, or a point (Connor).
An interspersed mastered tact error-correction procedure was implemented with all participants (Plaisance et al., 2016). Following an incorrect response, the experimenter provided a vocal model of the correct response, provided brief praise following a prompted correct response, and presented a mastered-tact trial using the same 5-s prompt-delay procedure used to teach study targets. Once the participant engaged in a correct response to the mastered tact, the experimenter re-presented the target stimulus for the trial. This procedure was repeated until a correct independent response was emitted, or five error-correction trials occurred without a correct independent response.
Condition Comparison
Set Condition. The purpose of this condition was to evaluate acquisition of tacts or intraverbals when a mastery criterion was applied to a set of targets. Stimuli presented in this condition were organized into five sets of three targets (see supporting information for all participants’ stimulus sets). Mastery was met for the set when the participant engaged in 100% correct independent responses across two consecutive sessions. The mastered set of three targets was then replaced with a novel set of three targets for which instruction was introduced. Sessions were conducted until all five sets were mastered.
Target Condition. The purpose of this condition was to evaluate acquisition of tacts or intraverbals when a mastery criterion was applied to individual targets. Mastery was met for this condition when the participant engaged in 100% correct independent responses to one target across two consecutive sessions. The mastered target was then replaced with a novel target and instruction continued. When nearly all the targets were mastered in this condition, a mastered tact or intraverbal from a different program (i.e., none of the 30 targets included in the investigation) was included in the session to ensure that nine trials were conducted during every session. For example, if the participant had mastered 13 of 15 tact targets, and two untrained tact targets remained, a mastered tact not included as a target in either condition was presented along with the two remaining untrained targets during instruction in the nine-trial session.
Maintenance
The purpose of maintenance was to assess if the mastery criterion assigned to each condition resulted in differences in correct responding over time. Maintenance sessions were conducted at 1-, 3-, and 5-week intervals following the day on which the targets were mastered. Once a set of targets in the set condition or an individual target in the target condition was mastered, the set or target was moved to the maintenance phase. Participants did not receive any additional exposure to, nor training of targets outside of maintenance sessions. Maintenance session procedures were like teaching (i.e., correct responses produced reinforcers), except error correction was excluded (i.e., no consequences occurred following an error) and unrelated mastered tasks (tacts or intraverbals not included in either condition) were interspersed approximately every two trials. Correct responses to unrelated mastered tasks resulted in praise and a tangible or point.
Results and Discussion
Participants’ cumulative data across both conditions are presented in Figure 1. Three participants (Josh, Tim, and Connor) acquired targets in fewer teaching sessions in the target condition than in the set condition, whereas differences across conditions were minimal (i.e., difference of ≤ 2 sessions to mastery across conditions) for two participants (Omar and Billy). Josh required 30, Tim required 21, and Connor required eight additional sessions of instruction to reach mastery in the set condition in comparison to the target condition. Omar’s and Billy’s results showed minimal differences between conditions, and they required only one or two additional session(s) of instruction, respectively, to acquire all targets in the set condition in comparison to the target condition. These data show tact and intraverbal responses were acquired in fewer teaching sessions by three participants when a mastery criterion was applied to targets rather than sets.
We observed two patterns in outcomes across conditions for participants, and participants were grouped based on these observed similarities. Figures 2 and 3 show representative data for one participant from each group. Omar (whose data are like Connor’s and Billy’s) showed minimal differences in acquisition of targets assigned to the set mastery criterion (Figure 2). That is, applying a mastery criterion to sets of stimuli did not result in extended exposure to teaching for any targets. In contrast, Josh (whose data are like Tim’s) showed a pattern of extended exposure to teaching for certain targets within stimulus sets (Figure 3). For example, in Set 2, Josh’s correct responding reached 100% for targets 1 and 2 in three teaching sessions. However, increases in correct responding were delayed and correct responding was variable for target 3 in the set. Due to delayed acquisition of target 3, the two other targets (targets 1 and 2) were exposed to an additional 20 sessions of teaching after reaching two consecutive sessions at 100%. Similarly, for Set 3, target 1 reached 100% correct responding in three sessions, and target 3 reached 100% correct responding in eight sessions. Delays in acquisition of target 2 resulted in eight additional sessions of teaching before the entire set met the mastery criterion. This same pattern of delayed mastery based on extended training of one target was observed for all of Josh’s sets.
Table 2 shows overtraining trials in the set condition for each stimulus set and participant. Recall that overtraining trials are additional instruction on individual targets that reached the mastery criterion before the entire stimulus set was mastered. The set condition produced an average of 1.6 to 18.2 overtraining trials per target. This resulted in the allocation of 1% to 55% of instructional time spent teaching targets that were already mastered (individually) due to applying mastery criteria to a set rather than individual targets.

Figure 1: Cumulative Number of Tacts or Intraverbals Mastered Across Conditions in Experiment 1
Figure 4 shows maintenance data at 1-, 3-, and 5-week intervals following mastery for each participant. All participants except Tim demonstrated maintenance of 75% or more of stimuli for both conditions with one exception. In week 5, Josh’s maintenance in the target condition decreased to 7 out of 15 targets, suggesting that less exposure to teaching based on a target mastery criterion could have hindered his long-term maintenance of responding correctly to these targets. An additional phase of maintenance was introduced for Tim due to a pattern of no responses during maintenance probes. In the initial maintenance sessions, Tim responded correctly to the beginning trials of each session. Partway through each session, Tim engaged in a no response and contacted the absence of prompts. Thereafter, he did not respond during the 5-s response interval in the remaining trials and frequently whispered the correct response after the experimenter removed the stimulus at the end of each trial. The modified maintenance phase included prompts so that a response was required for every trial. Following this modification, Tim’s correct independent responding increased to levels that were similar across conditions and consistent with other participants’ maintenance data.
Although the results of Experiment 1 show the benefits of implementing a target mastery criterion to maximize learning opportunities in client programming for three of the five participants, it is possible that these benefits may lead to inaccurate outcomes in certain situations. For example, when teaching AVCD, participants demonstrate mastery of targeted discriminations when they select the target only when the corresponding auditory stimulus is presented and not when other auditory stimuli are presented. Thus, a set mastery criterion may be necessary during some types of training (e.g., AVCD) to accurately measure acquisition. The purpose of the second experiment was to evaluate mastery criteria applied to sets versus targets when teaching AVCD and measure whether false positives occurred.

Figure 2: Percentage of Correct Independent Responses Per Target for Sets 1-5 in the Set Condition for Omar in Experiment 1
Experiment 2
Method
Participants, Setting, Materials, and Experimental Design
Omar and Josh participated in Experiment 2 since they each had intervention goals related to AVCD training. Materials included laminated stimulus cards, data collection materials, and preferred items provided during teaching. The experimental design was identical to that in Experiment 1.
Response Measurement
Data were collected on correct independent responses, errors, and no responses. The primary dependent variable was correct independent responses defined as the participant touching the comparison stimulus that corresponded to the auditory sample stimulus during the 6-s response interval. An error was defined as the participant touching a comparison stimulus that did not correspond to the auditory sample or touching more than one comparison stimulus. No response was defined as the participant refraining from touching any stimulus during the response interval. Correct independent responses were converted to a percentage by dividing the number of trials with a correct independent response by the total number of trials and multiplying by 100.
Data also were collected on the occurrence of a response to a stimulus in each trial of the session. Each stimulus served as an S+ on three trials and an S- on six trials per session. If the participant selected the picture of a dog in five of the nine trials, then this stimulus was scored as being selected in five trials. These data were used to calculate the frequency of overselecting a stimulus in trials during the last two sessions in which the target met the mastery criterion. Overselecting stimuli was only calculated in the target condition; overselecting stimuli was not possible when a set reached mastery because participants had to respond correctly to all targets in the set during 100% of trials. Overselecting a stimulus was defined as selecting a stimulus in more trials than it was programmed as an S+ (i.e., selecting the stimulus on more than three trials). We calculated the percentage of targets for which overselection occurred by dividing the number of targets that were overselected during the final two sessions of training by the total number of stimuli in the condition (i.e., 15) and multiplying by 100. Any occurrence of overselecting stimuli during the two sessions in which the target was mastered was identified as a false positive for acquisition.
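The overselection screen described above can be sketched as a small check. This is a hypothetical illustration (the function name and example counts are ours); it reflects the session structure in the text, in which each stimulus is the S+ on three of nine trials.

```python
# Sketch of the overselection screen (hypothetical helper, not the authors'
# software). Each stimulus serves as the S+ on three of the nine trials per
# session, so selecting a stimulus on more than three trials in a session
# indicates overselection, i.e., a possible false positive for acquisition.
S_PLUS_TRIALS_PER_SESSION = 3

def overselected(selection_counts):
    """selection_counts: number of trials on which the participant selected
    the stimulus, one count per mastery session (the final two sessions)."""
    return any(count > S_PLUS_TRIALS_PER_SESSION for count in selection_counts)

# A learner who touched the "dog" picture on five of nine trials in one
# mastery session overselected that stimulus:
flagged = overselected([5, 3])
```

The percentage of targets with overselection would then be the number of flagged targets divided by the 15 stimuli in the condition, multiplied by 100.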

Figure 3: Percentage of Correct Independent Responses Per Target for Sets 1-5 in the Set Condition for Josh in Experiment 1

Figure 4: Number of Targets Maintained across Weeks in Experiment 1
Note. Asterisks denote weeks in which data for specific targets were not collected due to extraneous circumstances such as appointment cancellation (e.g., quarantines, absences).
Interobserver Agreement and Treatment Integrity
Interobserver agreement (IOA) was calculated using the trial-by-trial method exactly as in Experiment 1 (Table 2). Treatment-integrity (TI) data also were calculated on a trial-by-trial basis for a minimum of 47% of sessions per participant. The trained observer scored whether the experimenter secured the participant’s visual attending to the stimulus array, presented the correct auditory sample stimulus, repeated the sample every 2 s, presented prompts at the appropriate prompt delay (i.e., 6 s), delivered the identity-matching picture prompt (when necessary), provided praise and tangible reinforcers at the correct times, and did not add treatment components. The trial was scored as correct (1) if the experimenter conducted all components of the trial according to the experimental protocol; deviations from the protocol resulted in a score of 0 for the trial. All other procedures were identical to those in Experiment 1.
Treatment-integrity data also were collected on the implementation of the target and set mastery criteria in a manner identical to Experiment 1. Mean TI for correct implementation of the mastery criteria across participants was 100% in both the target and set conditions.
Identification of Stimuli and Baseline
A baseline assessment was conducted to select target stimuli for inclusion in each condition and to verify similar levels of incorrect responding to each stimulus prior to teaching. The experimenter presented a three-stimulus horizontal array placed approximately 15 cm in front of the participant on the table. The experimenter ensured attending to the array, presented the auditory sample stimulus, and repeated it every 2 s (e.g., “cup,” “cup,” “cup” with 2 s between presentations;
Bergmann et al., 2021). Due to the timing of the sample re-presentation, participants had 6 s to engage in a response. No differential consequences were provided for responding. Stimuli were alternated across baseline trials so that each stimulus was targeted as the correct response on some trials and served as an incorrect comparison stimulus on other trials. The experimenter presented a mastered task approximately every two trials, and correct independent responses to mastered tasks resulted in praise and 20-s access to a tangible or a small edible. Inclusion criteria and procedures for the assignment of stimuli to conditions were identical to Experiment 1.
Procedure
One to two sessions of each condition were conducted per day, 3 to 5 days per week. Sessions included nine trials with three stimuli presented three times each. Stimuli were presented in a block exactly as in Experiment 1, except that the locations of the correct (S+) and incorrect (S-) comparison stimuli within the three-stimulus array were also counterbalanced (i.e., the S+ occurred in each position once per session per target).
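One way to program this counterbalancing is sketched below. This is our own illustration under stated assumptions (three stimuli, nine trials, S+ in each array position once per target); it is not the scheduling procedure the experimenters actually used beyond what the text describes.

```python
import random

def build_session(stimuli):
    """Build a nine-trial block: each stimulus is the S+ three times, once with
    the S+ in each position of the three-stimulus array. (Hypothetical sketch.)"""
    trials = []
    for target in stimuli:
        for position in range(3):  # S+ occupies each array position once per target
            array = [None, None, None]
            array[position] = target
            distractors = [s for s in stimuli if s != target]
            random.shuffle(distractors)
            for i in range(3):
                if array[i] is None:
                    array[i] = distractors.pop()
            trials.append({"sample": target, "array": array})
    random.shuffle(trials)  # intermix targets across the session
    return trials

session = build_session(["cup", "dog", "hat"])
print(len(session))  # 9 trials
```

Shuffling the assembled block preserves the positional counterbalance while preventing a predictable trial order.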
Trials were presented exactly as in baseline, except the experimenter implemented error correction by providing an identity-matching picture prompt (IMPP; Fisher et al., 2007) if the participant did not engage in a correct independent response. During the IMPP, the experimenter held up a visual stimulus that matched the S+ in the array, re-presented the auditory sample (e.g., “cup”), and repeated the auditory sample every 2 s during the 6-s response interval. If an incorrect or no response occurred following the IMPP, the experimenter repeated the auditory sample and physically guided the correct response. Following a prompted correct response, the trial was re-presented to provide an opportunity for a correct independent response (data not included in figures). Error correction continued until the participant engaged in a correct independent response or five error-correction trials occurred. Correct independent and prompted responses resulted in praise and 20-s access to a tangible or a small edible.
The mastery criteria for set and target conditions were identical to those in Experiment 1. All maintenance procedures were identical to Experiment 1 (i.e., procedures were like AVCD teaching without error correction).
Results and Discussion
Figure 5 shows participants’ cumulative data across both conditions. Omar and Josh both required fewer sessions to mastery for stimuli in the target condition in comparison to the set condition. Omar required seven additional teaching sessions to acquire targets in the set condition. Josh required 11 additional teaching sessions to acquire targets in the set condition. These results partially replicate those from Experiment 1 with a different skill, although Omar showed minimal difference in sessions to mastery across conditions in Experiment 1. Thus, the mastery criteria applied to targets versus sets when teaching AVCDs in Experiment 2 resulted in larger differences in sessions to mastery across conditions for Omar.
Both participants acquired AVCD responses to stimuli in the target condition in two to six sessions (black bars; Figure 6). White squares plotted on a secondary y-axis show the percentage of overselecting responses per target. These data are representative of the last two sessions (i.e., sessions in which mastery was met). Stimuli should have been selected in 33% of the trials when reaching mastery. Omar and Josh engaged in overselecting responses to three targets (20% of targets). Although the percentage of overselecting responses per target was never at or near 100%, any overselecting responses could be considered problematic because they represent continued errors despite identification of the target as mastered. Thus, 20% of AVCD stimuli in the target condition produced a false positive for acquisition for both participants. Selection responses for each target were also recorded in the set condition (Figures 7 and 8). However, use of a set mastery criterion eliminated instances of overselecting responses during mastery because the participant could not engage in an incorrect response and reach the 100% mastery criterion for the set.
Omar required the most sessions of teaching for set 1 (Figure 8) due to ongoing errors to targets 1 and 3. He required 10 sessions of teaching to meet mastery for set 1. In comparison, Omar’s responding reached the mastery criterion for sets 2-5 in three or four teaching sessions. Josh engaged in overselecting responses to target 1 that delayed mastery of set 3. If the mastery criterion had been applied to a target rather than the set, target 1 would have been considered mastered despite selection of target 1 in five out of nine trials (rather than the programmed three trials in which target 1 was the S+). However, teaching continued for set 3 and overselecting of target 1 decreased, and Josh’s responding met the mastery criterion for set 3 in seven teaching sessions. Josh’s responding met mastery in the other teaching sets in four to six teaching sessions.
Table 3 shows overtraining trials for the set condition for both participants. Omar had an average of four overtraining trials per target, and Josh had an average of six overtraining trials per target. As a result, 35% and 27% of Omar’s and Josh’s teaching time, respectively, was allocated to overtraining trials.
Omar and Josh maintained targets in both conditions across all weeks with one exception (Figure 9). Omar responded incorrectly to one target in the set condition at week 1; however, he responded correctly to this target in subsequent weeks. These results partially replicate maintenance outcomes of Experiment 1, replicate maintenance data in Wong and Fienup (2022), and further suggest that maintenance of responding to targets was similar across conditions.

Figure 5: Cumulative Number of AVCD Targets Mastered Across Conditions in Experiment 2

Figure 6: Number of Sessions to Mastery and the Percentage of Selection Responses Per Stimulus in the Target Condition for Omar and Josh in Experiment 2

Figure 7: Percentage of Correct Independent Responses and Selection Responses Per Stimulus in the Set Condition for Omar in Experiment 2

Figure 8: Percentage of Correct Independent Responses and Selection Responses Per Stimulus in the Set Condition for Josh in Experiment 2

Figure 9: Number of Targets Maintained across Weeks in Experiment 2
General Discussion
A mastery criterion applied to targets resulted in fewer teaching sessions for tacts and intraverbals for three of five participants in Experiment 1 and AVCD targets for both participants in Experiment 2. These results replicate Wong et al. (2021) and Wong and Fienup (2022) with neurodivergent participants and novel skills. In Experiment 1, we observed patterns of responding similar to previous studies in that three participants acquired tacts or intraverbals in fewer sessions in the target condition in comparison to the set condition. Mastery in the set condition was often delayed due to one target remaining in training after the other targets in the set were mastered, as in previous studies (Wong et al., 2021; Wong & Fienup, 2022).
We observed a similar pattern of delayed acquisition for the set condition in Experiment 2, extending the findings of Wong et al. (2021) and Wong and Fienup (2022) to instruction on listener skills. Omar and Josh required an additional seven and 11 sessions of instruction, respectively, in the set condition in comparison to the target condition in Experiment 2 (AVCD training). Interestingly, there were minimal delays in acquisition during AVCD training due to the consistent selection of one target in the set condition (i.e., selection response per stimulus graphs, Figures 7 and 8). Although some learners with ASD may display consistent biases to specific stimuli (e.g., Kodak et al., 2015), neither participant showed exclusive responding to one stimulus in any of the AVCD teaching sessions across sets. Nevertheless, overselection occurred to 20% of targets in the target condition at the point of mastery (Figure 6). Thus, selecting a target more often than it was programmed for instruction during trials occurred for 20% of targets during the two sessions leading to mastery, which resulted in false positives for acquisition in the target condition.
Overselection of a stimulus in the array can be problematic when a target mastery criterion is used because the mastery criterion can be met despite the participant engaging in biased responding to a stimulus in the array (Grow et al., 2011). For example, if presented with pictures of a Boston Terrier, Weimaraner, and Dachshund, the participant’s responding may meet mastery for the Boston Terrier if that stimulus is selected on most or all trials in the session. This pattern of responding would incorrectly identify mastery of the Boston Terrier during AVCD training, which would result in a false positive for acquisition. This same pattern of biased responding also could occur during tact training if the participant engages in the same response during all tact training trials (e.g., the participant says, “Boston Terrier” on all trials). Although we did not evaluate this pattern of biased responding during tact and intraverbal training in Experiment 1, the overselection results during AVCD instruction in Experiment 2 (i.e., false acquisition for 20% of stimuli in the target condition) suggest this behavior should be measured if a target mastery criterion is applied to instruction. Further, because Wong et al. (2021) and Wong and Fienup (2022) did not report overselection during sight word instruction, it is unclear whether and to what extent persistent patterns of biased responding and false positives occurred in previous studies. Future research should further investigate the likelihood of overselection responses that can lead to false positives across tasks when a target mastery criterion is applied during skill-acquisition programs.
If a target mastery criterion is applied to training, practitioners could revise the target mastery criterion used in the current and previous studies to prevent overselection from leading to false positives for acquisition. The target mastery criteria could include criteria related to discriminated responding across targets in addition to correct responses to the target. For example, during AVCD training with a target mastery criterion, practitioners could require learners to respond correctly to the stimulus (e.g., the picture of the hippo) on all trials in which that stimulus is targeted (i.e., when the instructor says, “hippo”) and not respond to the stimulus on trials in which other stimuli are targeted (e.g., when the instructor says, “alligator” or “flamingo”; when the picture of a hippo is an S-). If these revised criteria were applied in the present investigation, participant responding in Experiment 2 would have met the mastery criteria when the participant (1) engaged in 100% correct independent responses to the target and (2) did not touch the target when it was an S- on 100% of trials across two consecutive sessions.
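As a concrete illustration, the revised criterion could be checked with a routine like the following. This is a sketch under our own assumed data format (per-trial dicts with the S+ and the selected stimulus); it is not software from the study.

```python
def meets_revised_mastery(sessions, target):
    """True if, across the two most recent consecutive sessions, the learner
    (1) responded correctly on 100% of trials in which `target` was the S+ and
    (2) never touched `target` on trials in which it was an S-.
    (Hypothetical sketch of the revised criterion.)"""
    if len(sessions) < 2:
        return False
    for trials in sessions[-2:]:
        for t in trials:
            if t["sample"] == target and t["selected"] != target:
                return False  # error on an S+ trial
            if t["sample"] != target and t["selected"] == target:
                return False  # overselection: touched target when it was an S-
    return True

# Two error-free sessions vs. a session ending with an overselection of "cup".
perfect = [{"sample": s, "selected": s} for s in ["cup", "dog", "hat"] * 3]
biased = perfect[:-1] + [{"sample": "hat", "selected": "cup"}]

print(meets_revised_mastery([perfect, perfect], "cup"))  # True
print(meets_revised_mastery([perfect, biased], "cup"))   # False
```

Under this check, a target selected on S- trials cannot be counted as mastered, which removes the false-positive pathway described above.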
Although false positives for acquisition were observed for 20% of stimuli in the target condition in Experiment 2, correct responding to those stimuli maintained up to 5 weeks after the end of training. One of those stimuli was associated with one incorrect response in week 1 of maintenance (i.e., the participant responded to that stimulus when it was an S-); thereafter, that stimulus was not associated with incorrect responses. Thus, identification of these stimuli as being associated with false positives for acquisition did not lead to persistent errors following training. Altering the criteria used to identify stimuli associated with false positives for acquisition may be necessary in future research and practice to avoid this issue.
As in Wong and Fienup (2022), a mastery criterion of 100% correct responses across two consecutive sessions was applied to the set and target conditions in this investigation, whereas Wong et al. (2021) implemented a mastery criterion of one session at 100%. The current findings support previous research suggesting that more stringent mastery criteria lead to higher levels of response maintenance (Fuller & Fienup, 2018; Wong & Fienup, 2022). More stringent mastery criteria likely increase the number of pairings of reinforcers with responses in the presence of relevant discriminative stimuli. Further, more stringent mastery criteria applied to stimulus sets might lead to overtraining because already mastered targets continue to be practiced and reinforced while the remaining target(s) in the set reach mastery. Overtraining can strengthen stimulus classes (Bortoloti et al., 2013), but there are limited evaluations of the effects of overtraining on maintenance of skills (McDougale et al., 2020). Wong et al. observed somewhat lower levels of maintenance in the target condition relative to the set condition (which included overtraining) for two of their four participants when using less stringent mastery criteria, whereas the results of Wong and Fienup and the present study suggest more stringent mastery criteria led to comparable maintenance across target and set conditions. Thus, any benefits of overtraining that occur in the set condition may be reduced when more stringent mastery criteria are applied to instruction. However, future research is needed to further evaluate the effects of more stringent mastery criteria on maintenance (Fienup & Carr, 2021).
Results of two participants in Experiment 1 differ from those of Wong et al. (2021) and Wong and Fienup (2022). Omar and Billy had minimal differences (i.e., no more than two sessions across conditions) in sessions to mastery of targets, suggesting either set or target mastery criteria can be used during tact instruction. Wong and Fienup found large differences in sessions to mastery across conditions, although these differences were typically produced by one or more targets delaying acquisition in the set condition. For Billy and Omar, there were no specific targets in the set condition that required extended training, which resulted in rapid mastery of sets of stimuli. In comparison, consistent errors (i.e., engaging in the same incorrect response across trials; Scott et al., 2021) were a pattern of responding that delayed other participants’ acquisition of specific targets in the set condition. For example, Josh had a consistent error that delayed acquisition in set 2 of the set condition. Consistent errors during instruction can delay acquisition when stimuli are grouped into sets because one target may require many more exposures to prompts of the alternative response while the other targets in the set are quickly mastered. Although we used a common and empirically based method to select targets and assign them to sets in the set condition (i.e., logical analysis; Wolery et al., 2018), consistent errors to stimuli during prestudy probes are not a variable included in this method. Researchers and practitioners who plan to teach stimuli in sets could collect data on consistent errors to specific stimuli during probes and consider either excluding those stimuli from any planned comparison or using target mastery criteria to prevent overtraining of other targets in the set.
In previous experiments, participants learned textual responses to sight words, whereas the current investigation replicated these outcomes with tacts or intraverbals (Experiment 1) and AVCDs (Experiment 2). Taken together, the current and previous studies suggest a mastery criterion applied to individual targets may increase the efficiency of instruction (by reducing sessions to mastery) for these three skills for some neurodivergent learners. Nevertheless, the ease with which the target mastery criterion can be implemented by practitioners is an important consideration. Tracking mastery and replacing mastered targets in practice would likely need to be done by Registered Behavior Technicians® (RBTs®) and behavioral technicians, which could be a cumbersome task requiring training to ensure integrity (Brand et al., 2019). In the current investigation, researchers closely monitored acquisition of targets and sets daily. In practice, daily oversight by supervisors may be less likely. To increase the feasibility of implementation, practices may need to be modified to help RBTs® and behavioral technicians identify the point of mastery of targets. Strategies to increase the feasibility and integrity of implementation of target mastery criteria are an important topic for future research.
References
Bergmann, S., Turner, M., Kodak, T., Grow, L. L., Meyerhofer, C., Niland, H. S., & Edmonds, K. (2021). Replicating stimulus-presentation orders in discrimination training. Journal of Applied Behavior Analysis, 54(2), 793–812. https://doi.org/10.1002/jaba.797
Bortoloti, R., Rodrigues, N. C., Cortez, M. D., Pimentel, N., & de Rose, J. C. (2013). Overtraining increases the strength of equivalence relations. Psychology and Neuroscience, 6(3), 357–364. https://doi.org/10.3922/j.psns.2013.3.13
Brand, D., Henley, A. J., DiGennaro Reed, F. D., Gray, E., & Crabbs, B. (2019). A review of published studies involving parametric manipulations of treatment integrity. Journal of Behavioral Education, 28(1), 1–26. https://doi.org/10.1007/s10864-018-09311-8
Cariveau, T., Batchelder, S., Ball, S., & La Cruz Montilla, A. (2021). Review of methods to equate target sets in the adapted alternating treatments design. Behavior Modification, 45(5), 695–714. https://doi.org/10.1177/0145445520903049
Cariveau, T., & Fetzner, D. (2022). Experimental control in the adapted alternating treatments design: A review of procedures and outcomes. Behavioral Interventions. Advance online publication. https://doi.org/10.1002/bin.1865
Carr, J. E., Nicolson, A. C., & Higbee, T. S. (2000). Evaluation of a brief multiple-stimulus preference assessment in a naturalistic context. Journal of Applied Behavior Analysis, 33(3), 353–357. https://doi.org/10.1901/jaba.2000.33-353
Fienup, D. M., & Carr, J. E. (2021). The use of performance criteria for determining “mastery” in discrete-trial instruction: A call for research. Behavioral Interventions, 36, 756–763. https://doi.org/10.1002/bin.1827
Fisher, W., Piazza, C. C., Bowman, L. G., Hagopian, L. P., Owens, J. C., & Slevin, I. (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25(2), 491–498. https://doi.org/10.1901/jaba.1992.25-491
Fisher, W. W., Kodak, T., & Moore, J. W. (2007). Embedding an identity-matching task within a prompting hierarchy to facilitate acquisition of conditional discriminations in children with autism. Journal of Applied Behavior Analysis, 40(3), 489–499. https://doi.org/10.1901/jaba.2007.40-489
Fuller, J. L., & Fienup, D. M. (2018). A preliminary analysis of mastery criterion level: Effects on response maintenance. Behavior Analysis in Practice, 11(1), 1–8. https://doi.org/10.1007/s40617-017-0201-0
Grow, L. L., Carr, J. E., Kodak, T. M., Jostad, C. M., & Kisamore, A. N. (2011). A comparison of methods for teaching receptive labeling to children with autism spectrum disorders. Journal of Applied Behavior Analysis, 44(3), 475–498. https://doi.org/10.1901/jaba.2011.44-475
Knutson, S., Kodak, T., Costello, D. R., & Cliett, T. (2019). Comparison of task interspersal ratios on skill acquisition and problem behavior for children with autism spectrum disorder. Journal of Applied Behavior Analysis, 52(2), 355–369. https://doi.org/10.1002/jaba.527
Kodak, T., Clements, A., Paden, A. R., LeBlanc, B., Mintz, J., & Toussaint, K. A. (2015). Examination of the relation between an assessment of skills and performance on auditory-visual conditional discriminations for children with autism spectrum disorder. Journal of Applied Behavior Analysis, 48(1), 52–70. https://doi.org/10.1002/jaba.160
Kodak, T., Halbur, M., Bergmann, S., Costello, D. R., Benitez, B., Olsen, M., Gorgan, E., & Cliett, T. (2020). A comparison of stimulus set size on tact training for children with autism spectrum disorder. Journal of Applied Behavior Analysis, 53(1), 265–283. https://doi.org/10.1002/jaba.553
Lovaas, O. I. (1981). Teaching developmentally disabled children: The me book. University Park Press.
Maurice, C., Green, G., & Luce, S. C. (1996). Behavioral interventions for young children with autism: A manual for parents and professionals. PRO-ED.
McDougale, C. B., Richling, S. M., Longino, E. B., & O’Rourke, S. A. (2020). Mastery criteria and maintenance: A descriptive analysis of applied research procedures. Behavior Analysis in Practice, 13(2), 402–410. https://doi.org/10.1007/s40617-019-00365-2
Plaisance, L., Lerman, D., Laudont, C., & Wu, W. (2016). Inserting mastered targets during error-correction when teaching skills to children with autism. Journal of Applied Behavior Analysis, 49(2), 251–264. https://doi.org/10.1002/jaba.292
Richling, S. M., Williams, W. L., & Carr, J. E. (2019). The effects of different mastery criteria on the skill maintenance of children with developmental disabilities. Journal of Applied Behavior Analysis, 52(3), 707–717. https://doi.org/10.1002/jaba.580
Scott, A. P., Kodak, T., & Cordeiro, M. C. (2021). Do targets with persistent responses affect the efficiency of instruction? Analysis of Verbal Behavior, 37, 217–225. https://doi.org/10.1007/s40616-021-00163-4
Sindelar, P. (1985). An adapted alternating treatments design for instructional research. Education and Treatment of Children, 8(1), 67–76.
Sundberg, M. L. (2008). Verbal behavior milestones assessment and placement program: The VB-MAPP. AVB Press.
Wolery, M., Gast, D. L., & Ledford, J. R. (2018). Comparative designs. In J. R. Ledford & D. L. Gast (Eds.), Single case research methodology. Routledge. https://doi.org/10.4324/9781315150666-11
Wong, K. K., Bajwa, T., & Fienup, D. M. (2021). The application of mastery criterion to individual operants and the effects on acquisition and maintenance of responses. Journal of Behavioral Education. Advance online publication. https://doi.org/10.1007/s10864-020-09420-3
Wong, K. K., & Fienup, D. M. (2022). Units of analysis in acquisition-performance criteria for “mastery”: A systematic replication. Journal of Applied Behavior Analysis. Advance online publication. https://doi.org/10.1002/jaba.915
Received December 15, 2021
Final acceptance June 16, 2022
Action Editor, Daniel Fienup