

Evidence-Based Performance Management: Applying Behavioral Science to Support Practitioners

Jan 5, 2026

 

Matthew D. Novak, Florence D. DiGennaro Reed, Tyler G. Erath, Abigail L. Blackman, Sandra A. Ruby & Azure J. Pellegrino
© Association for Behavior Analysis International 2019

Abstract

The science of behavior has effectively addressed many areas of social importance, including the performance management of staff working in human-service settings. Evidence-based performance management entails initial preservice training and ongoing staff support. Initial training reflects a critical first training component and is necessary for staff to work independently within an organization. However, investment in staff must not end once preservice training is complete. Ongoing staff support should follow preservice training and involves continued coaching and feedback. The purpose of this article is to bridge the research-to-practice gap by outlining research-supported initial training and ongoing staff support procedures within human-serving settings, presenting practice guidelines, and sharing information about easy-to-implement ways practitioners may stay abreast of current research.

Keywords: Performance management · Staff training · Supervision

Portions of this manuscript were presented as an invited talk by Florence D. DiGennaro Reed at the OBM in Health and Human Services Conference, Denver, CO.

Florence D. DiGennaro Reed
fdreed@ku.edu

4001 Dole Human Development Center, Department of Applied Behavioral Science, University of Kansas, 1000 Sunnyside Avenue, Lawrence, KS 66045, USA
Published online: 26 November 2019

Train people well enough so they can leave, treat them well enough so they don’t
want to.
—Richard Branson

The science of behavior has effectively addressed many areas of social importance, including education (e.g., Sulzer-Azaroff & Gillat, 1990), traffic safety (e.g., van Houten, Nau, & Marini, 1980; Yeaton & Bailey, 1978), substance use (e.g., Higgins, Silverman, & Heil, 2007), parent training (e.g., Lindgren et al., 2016; Phaneuf & McIntyre, 2007), behavioral medicine (e.g., Piazza, Milnes, & Shalev, 2015), and others. Perhaps the most widely known application is the behavioral treatment of autism (Freedman, 2016). Due to the research supporting its effectiveness (e.g., National Autism Center, 2009, 2015), autism treatment based on the principles of behavior analysis is endorsed by the U.S. Surgeon General (U.S. Department of Health & Human Services, 1999), National Institutes of Health (Strock, 2007), and the National Research Council (Lord & McGee, 2001). The model that has evolved for the provision of services to individuals with autism and other disabilities presents a challenge that is unique to those in health and human services. Whereas psychological therapy is delivered by licensed psychologists and medical procedures are performed by doctors, behavior analysts with the most training and education (i.e., Board Certified Behavior Analysts®) often do not deliver services directly. Rather, they typically oversee the provision of services delivered by others. Thus, the challenge for those providing behavioral services to vulnerable populations is ensuring that individuals with relatively less training and education (i.e., Registered Behavior Technicians® or Board Certified Assistant Behavior Analysts®) provide direct services in an effective and responsible manner.

To be successful, this unique service model necessitates effective performance management—the careful training and supervision of staff who implement behavioral treatment. Unfortunately, recent research suggests the field could benefit from improvements in this area. DiGennaro Reed and Henley (2015) administered a survey to certified and aspiring behavior analysts to determine the extent to which evidence-based staff training practices were implemented by their employers. The results were worrisome and launched efforts by the authors to disseminate recommended practices to practitioners and organizational leaders (e.g., DiGennaro Reed, 2016, 2017, 2018, 2019; Henley, 2018, 2019). Only 55% of respondents indicated they received initial training after being hired; when training was provided, organizations relied on tactics with little empirical support, such as instruction alone. Fortunately, 71% of respondents reported they received ongoing training and support, but primarily in the forms of a lecture or monthly feedback, which are typically less effective than other forms of performance management. Respondents who worked as supervisors were also queried about the training they received to prepare them for the important responsibility of leading and managing staff. Only 33% indicated they received any formal training on how to effectively implement research-supported supervision practices. These results suggest training and performance management in settings that employ behavior analysts are not consistent with recommended practices. Many respondents began working without any formal training, received less-than-ideal ongoing support, and were expected to supervise staff without any guidance regarding best practices.

Many practicing behavior analysts lack sufficient training and ongoing support, which is worrisome because the lack of evidence-based staff training and performance management may lead to negative outcomes, including staff turnover, poor service quality, and charges of unethical conduct. Kazemi, Shapiro, and Kavner (2015) surveyed 100 behavior technicians from several companies and found that satisfaction with initial training, ongoing staff support, and supervisor behavior were key factors influencing respondents’ intent to quit their jobs. These results suggest that providing evidence-based staff training may help to mitigate staff turnover, which is reported to be as high as 75% within human-service organizations (Kazemi et al., 2015). Moreover, staff turnover places substantial financial strain on an organization, with average costs ranging from $5,000 for a single behavior technician to $10,000 for a certified behavior analyst (Sundberg, 2016).
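
The financial strain described above can be made concrete with simple arithmetic. The sketch below is not from the article; it is a back-of-the-envelope estimate combining the turnover rate and per-replacement costs cited above, with a hypothetical headcount chosen for illustration.

```python
def annual_turnover_cost(headcount, turnover_rate, cost_per_replacement):
    """Rough annual cost of staff turnover for one role.

    headcount: number of staff in the role
    turnover_rate: fraction leaving per year (e.g., 0.75 per Kazemi et al., 2015)
    cost_per_replacement: dollars to replace one staff member
        ($5,000-$10,000 per Sundberg, 2016)
    """
    return headcount * turnover_rate * cost_per_replacement

# A hypothetical team of 20 behavior technicians at 75% annual turnover
# and $5,000 per replacement costs the organization $75,000 per year.
print(annual_turnover_cost(20, 0.75, 5_000))
```

Even at the low end of the cited replacement costs, the estimate suggests why investing in training and support that reduces turnover can pay for itself.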

In addition to affecting workforce stability, insufficient training and performance management may further disrupt the quality of behavioral services by influencing treatment integrity—the extent to which prescribed treatment procedures are implemented correctly (Peterson, Homer, & Wonderlich, 1982). In a review of 19 parametric analyses of treatment integrity, Brand, Henley, DiGennaro Reed, Gray, and Crabbs (2019) reported that treatment integrity errors produce unpredictable client outcomes. Although, in some instances, errors in behavioral treatment did not negatively influence client skill acquisition or problem behavior (e.g., Leon, Wilder, Majdalany, Myers, & Saini, 2014), in general researchers showed that integrity errors resulted in slower learner progress and less effective treatment. These findings underscore the importance of ensuring that staff implement treatment protocols as designed, which is fostered by the quality of training and performance management staff experience.
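
Treatment integrity is conventionally summarized as the percentage of prescribed protocol steps implemented correctly during an observation. A minimal sketch of that calculation, not taken from the article, might look like this:

```python
def treatment_integrity(steps_observed):
    """Percentage of protocol steps implemented correctly.

    steps_observed: list of booleans, one per prescribed step
    (True = implemented as designed).
    """
    if not steps_observed:
        raise ValueError("no steps observed")
    return 100 * sum(steps_observed) / len(steps_observed)

# An observer scores a 10-step protocol; 8 steps were correct.
print(treatment_integrity([True] * 8 + [False] * 2))  # 80.0
```

Tracking this percentage across observations gives supervisors the data needed to detect the integrity errors that, per Brand et al. (2019), can slow learner progress.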

Insufficient performance management practices may have consequences that extend beyond affecting direct services; they may carry severe consequences for behavior analysts individually as well as for the profession more broadly. To ensure quality provision of services, the Behavior Analyst Certification Board® (BACB®) Professional and Ethical Compliance Code for Behavior Analysts (BACB, 2014) outlines a set of guidelines that all behavior analysts must follow. For example, section 5.03 of the Code states, “If the supervisee does not have the skills necessary to perform competently, ethically, and safely, behavior analysts provide conditions for the acquisition of those skills” (p. 14). Thus, supervisors are obligated to provide staff with effective training. On the topic of ongoing staff support, section 5.06 states, “Behavior analysts design feedback and reinforcement systems in a way that improves supervisee performance. Behavior analysts provide documented, timely feedback regarding the performance of a supervisee on an ongoing basis” (p. 14). Together these statements indicate that behavior analysts must rely on evidence-based staff training and performance management procedures to ensure compliance with this code. Failure to do so may result in ethical charges against the credentialed behavior analyst, which could lead to loss of certification or state licensure.

Evidence-based performance management involves a program that includes initial preservice training and ongoing staff support. Initial training, often referred to as orientation or onboarding, reflects a critical first training component and is necessary for staff to work independently within an organization. However, investment in staff must not end once preservice training is complete. Ongoing staff support should follow preservice training and it involves supervisors providing continued coaching and feedback regarding the staff’s performance in one or more areas. The purpose of this article is to bridge the research-to-practice gap by outlining research-supported initial training and ongoing staff support procedures within human-service settings, presenting practice guidelines based on our experience and current research, and sharing information about ways to stay up to date with research.

Preservice Training

The natural first step to ensuring high-quality staff training is the delivery of preservice training, which allows new employees to learn relevant job skills in a controlled, distraction-free environment. Although orientation and onboarding procedures are ubiquitous across all forms of employment, the presence of training alone does not guarantee proficiency with job skills. To be most effective, preservice training should use empirically supported training techniques. One empirically supported technique is behavioral skills training (BST), which is a procedure used to train new skills through a package including instruction, modeling, rehearsal, and feedback (Parsons, Rollyson, & Reid, 2012). Behavioral skills training can be implemented individually or in groups and has been used to train several tasks in the human-service industry, including discrete trial instruction (Sarokoff & Sturmey, 2004) and mand training (Nigro-Bruzzi & Sturmey, 2010), among others. Each component of BST is outlined below.

Instruction

The first component of BST is instruction, which involves describing a target skill or behavior one expects the trainee to perform (Miltenberger, 2016). Trainers can deliver instruction by describing procedures vocally (e.g., lecture, discussion) or textually (e.g., written protocols). Written protocols can include text alone or text with supplemental components, such as diagrams and images (i.e., enhanced written instructions; Berkman, Roscoe, & Bourret, 2019; Graff & Karsten, 2012). All too often, preservice training ends after instruction alone (DiGennaro Reed & Henley, 2015). It is critical to know that instruction alone is insufficient for training new skills (Ducharme & Feldman, 1992). Instead, training should begin with instruction followed by the other components that comprise BST.

This is not to say that instruction does not influence performance. Indeed, there are several ways in which instruction can be delivered to influence trainees’ performance. For example, Henley, Hirst, DiGennaro Reed, Becirevic, and Reed (2017) compared the use of directive instructions (e.g., “you must”) to nondirective instructions (e.g., “you might consider”) on trainees’ responding under changing reinforcement schedules in a laboratory task. When presented with directive instructions, participants responded in accordance with the instructions, even when the instructions were inaccurate (i.e., instructions did not match the reinforcement schedules). Participants in the nondirective condition, however, responded in accordance with the reinforcement schedules, independent of the instructions. Studies on this topic suggest that, when creating instructions for staff, it is important to consider how instructions are delivered, as minor variations may affect performance.

There are isolated instances in the literature of participants correctly performing skills after detailed instruction alone. For example, Graff and Karsten (2012) taught staff to effectively conduct preference assessments using enhanced written instructions, which included a detailed data sheet, diagrams, and step-by-step instructions written without technical jargon. However, Shapiro, Kazemi, Pogosjana, Rios, and Mendoza (2016) only partially replicated Graff and Karsten’s findings and had to include additional training procedures to reach criterion performance. Instances of performing to criterion after instruction alone are rare and may depend on various factors, such as the skill being trained, the instruction format, and the trainee’s history. Because these several factors must align for instruction alone to be effective, it is prudent to implement all components of BST when conducting preservice training.

Practice Guidelines We recommend that trainers present clear and succinct instructions about procedures, as Jarmolowicz et al. (2008) demonstrated that treatment adherence is better when instructions are written in a conversational form than in technical language. Moreover, it may be helpful to provide jargon-free, written protocols with diagrams (Graff & Karsten, 2012) that staff can use during training and refer to later, if needed. In our consultation experience, we have observed that trainees often appreciate the rationale for certain procedures, and providing that information may help guard against performance drift, though research is still needed in this area. Finally, we recommend trainers and supervisors be transparent about specific behaviors that will be observed and measured. It is presumed that the behaviors that will be observed and measured are of the utmost importance and, thus, should not be kept a secret from trainees. Also, new staff members generally want to perform well and may become nervous about their responsibilities and being observed. In our experience, transparency about what is expected and how performance will be measured generally reduces trainees’ nervousness.

Modeling

The second component of BST is modeling, which involves an experienced staff person demonstrating perfect performance of a target skill or behavior one expects the trainee to imitate (Miltenberger, 2016). Modeling can take place in person or via video. Video modeling offers the advantages of conserving resources when used over time and being transportable, allowing trainees the opportunity to view the video model outside of the training session. Further, a live model may contain slight errors or inconsistencies across trainings that affect training outcomes; video modeling provides the added benefit of standardizing training procedures and ensuring that only desired models are shown (DiGennaro Reed, Blackman, Erath, Brand, & Novak, 2018; Shapiro & Kazemi, 2017).

Recent research has shown video modeling, in particular video modeling with voiceover instruction, to be more effective than written instructions alone—though feedback was required for some participants—when training new staff to implement discrete trial instruction and behavior reduction interventions (e.g., DiGennaro Reed, Codding, Catania, & Maguire, 2010; Giannakakos, Vladescu, Kisamore, & Reeve, 2016). Furthermore, staff effectively learned how to conduct preference assessments after receiving training comprised of video modeling with voiceover instruction (e.g., Delli Bovi, Vladescu, DeBar, Carroll, & Sarokoff, 2017).

Practice Guidelines When incorporating modeling in training, it is important to model the entire procedure for the trainee. In addition, we recommend standardizing the models across training episodes to ensure the relevant skills are being modeled consistently. The latter recommendation may be accomplished by relying on video models or ensuring live models receive written protocols to follow during training. Finally, to aid in generalization, modeling should entail multiple exemplars of the target skill (Moore & Fisher, 2007). DiGennaro Reed, Erath, Brand, and Novak (2019) provide resources for creating video models, which readers may find beneficial.

Rehearsal

The third component of BST, rehearsal—also referred to as practice or role play—involves having trainees practice a target skill (Miltenberger, 2016). Rehearsal can be arranged in multiple ways, such as using a trained researcher or confederate (e.g., Iwata et al., 2000; McGimsey, Greene, & Lutzker, 1995; Phillips & Mudford, 2008), another trainee who is acquiring the skill being rehearsed (e.g., Palmen, Didden, & Korzilius, 2010; Wallace, Doney, Mintz-Resudek, & Tarbox, 2004), or a service recipient (e.g., Erbas, Tekin-Iftar, & Yucesoy, 2006); trainers can also vary the number of rehearsal opportunities (e.g., Jenkins & DiGennaro Reed, 2016). An important distinction is that rehearsal alone does not guarantee high levels of performance; in fact, it may allow trainees to practice errors, which may impede acquisition (Ward-Horner & Sturmey, 2012). Thus, rehearsal is typically accompanied by performance feedback and continues until trainees achieve mastery (Reid, Parsons, & Green, 2011).

Practice Guidelines In suggesting guidelines for trainers, we offer that practice does not make perfect; rather, practice with feedback makes perfect. That is, feedback should be delivered immediately following each rehearsal opportunity; thus, rehearsal and feedback should occur in tandem. Feedback is described in more detail in the next section. To help trainees acquire all relevant skills, the trainer must engineer opportunities for the trainee to practice the entire procedure, respond to a range of client responses, receive feedback, and meet a mastery criterion. These opportunities include using confederates in-vivo or in an analog setting, and having the confederate behave in scripted or predetermined ways to ensure the trainee practices all the components of a procedure and responds to a range of behaviors. Allowing trainees to respond to many situations is important, especially when dealing with vulnerable populations and procedures that need to be implemented with high integrity.
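
The rehearse-with-feedback-until-mastery cycle described above can be sketched as a simple loop. This is an illustrative sketch, not the article's procedure; the scoring and feedback callables, the 100% criterion, and the two-consecutive-rehearsals mastery rule are all assumptions chosen for the example.

```python
def rehearse_to_mastery(score_rehearsal, give_feedback,
                        criterion=1.0, consecutive=2, max_trials=20):
    """Run rehearsal-plus-feedback cycles until a mastery criterion is met.

    score_rehearsal: callable returning the proportion of protocol steps
        performed correctly on one rehearsal opportunity (0.0 to 1.0)
    give_feedback: callable taking that score; called immediately after
        each rehearsal, per the guideline above
    Mastery = `criterion` performance on `consecutive` rehearsals in a row.
    Returns the number of rehearsals needed, or None if not mastered.
    """
    streak = 0
    for trial in range(1, max_trials + 1):
        score = score_rehearsal()
        give_feedback(score)  # feedback follows every rehearsal, not just errors
        streak = streak + 1 if score >= criterion else 0
        if streak >= consecutive:
            return trial
    return None

# A hypothetical trainee improving across rehearsals:
scores = iter([0.6, 0.8, 1.0, 1.0])
print(rehearse_to_mastery(lambda: next(scores), lambda s: None))  # 4
```

The design point the loop makes explicit is that feedback is delivered after every rehearsal opportunity, and rehearsal does not end at a single correct performance but at a stable criterion.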

Feedback

The fourth component of BST is feedback, which refers to information about past performance that specifies how the trainee can improve performance in the future (Miltenberger, 2016). At this stage of training, feedback should be delivered immediately after each rehearsal opportunity and specify steps performed correctly and steps requiring correction (DiGennaro Reed et al., 2018). Several reviews of the literature reveal there are various dimensions of feedback, such as its source (supervisor, peer), medium or mode (written, verbal), frequency (daily, weekly), recipients (individuals, groups), privacy (private, public), and content (type of information provided; Alvero, Bucklin, & Austin, 2001; Balcazar, Hopkins, & Suarez, 1985; Prue & Fairbank, 1981). In addition, the way in which corrective and positive feedback are sequenced can influence its efficacy (Henley & DiGennaro Reed, 2015; Slowiak & Lakowske, 2017). Although supervisors and trainers have flexibility regarding how to implement the various dimensions of feedback, it is likely most efficient and effective to deliver both corrective and positive verbal feedback immediately upon completion of rehearsal.

Practice Guidelines The numerous variations and dimensions of feedback present a challenge to researchers seeking to examine the most effective combinations. Thus, we cannot recommend one exact combination of methods; such a prescription might be impractical anyway, as most supervisors are subject to various organizational and logistical constraints that limit their control over certain factors. However, research on the dimensions of feedback and its delivery has produced results with enough consistency to support general practice guidelines.

We recommend trainers base their feedback on direct observations rather than verbal reports or indirect measures. The use of observational data provides important information for the trainer and may also be used as a data source for delivering feedback. Next, trainee performance may improve faster if corrective feedback is presented before positive feedback (Henley & DiGennaro Reed, 2015); however, we often ask trainees to specify their preference for the order in which feedback is delivered. Third, supervisors should deliver feedback immediately after performance, as research suggests immediate feedback is more effective than delayed feedback (Goodman, Brady, Duffy, Scott, & Pollard, 2008) and staff prefer immediate feedback compared to delayed feedback (Reid & Parsons, 1996). We also recommend that feedback is used in conjunction with the components of BST described previously. Finally, it is important that feedback is delivered in a respectful and professional manner to maintain a positive working relationship, in particular when corrective feedback must be shared.
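
A small sketch, not from the article, of how observational data might be turned into feedback talking points. The corrective-before-positive ordering follows the finding cited above (Henley & DiGennaro Reed, 2015); the step names and message wording are hypothetical.

```python
def feedback_summary(step_results):
    """Compose talking points for verbal feedback from one observation.

    step_results: dict mapping protocol step name -> bool (performed correctly?)
    Corrective items are listed first, followed by praise for correct steps,
    so the feedback is grounded in directly observed performance.
    """
    corrective = [s for s, ok in step_results.items() if not ok]
    positive = [s for s, ok in step_results.items() if ok]
    lines = [f"To improve: {s}" for s in corrective]
    lines += [f"Well done: {s}" for s in positive]
    return lines

# Hypothetical observation of a two-step protocol:
print(feedback_summary({"deliver prompt": True, "provide reinforcer": False}))
```

Note that the trainee-preference caveat above still applies: some trainees may prefer positive feedback first, and the ordering could be made a parameter.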

Ongoing Staff Support

It would be a mistake to assume that, once preservice training is complete, staff will perform all skills with high integrity while on the job and in a complex real-world environment. Thus, high-quality preservice training should be viewed as the initial investment in the professional development of staff. Based on decades of research, it is reasonable to assume that staff may not generalize skills learned in a contrived training environment to the actual work setting or maintain high levels of performance over extended periods of time. To address these challenges and support the maintenance and growth of staff skills, ongoing coaching is necessary. In the following sections, we outline how supervisors can provide ongoing support to staff following preservice training.

Observations and Feedback

The next step in the training process is the provision of ongoing supervision and support. This usually involves supervisory observations of the staff implementing varied procedures. The purpose of direct observations is twofold: to provide the supervisor with opportunities to monitor treatment integrity in the natural environment and to use data to provide performance feedback to staff. Depending on contextual variables operating in the organization (e.g., staff schedules, supervisor schedules, the setting), these observations can vary widely in when, how, and how often they are employed. For example, observations can be scheduled in advance or unscheduled drop-ins (when); viewed live (i.e., in-vivo) or via video recordings (how); and conducted as often or as little as needed, depending on staff performance, feasibility, and other related variables (how often).

Although variability exists regarding how supervisory observations are implemented, one commonality across all variations is to ensure recommended practices are being used. Table 1 depicts an on-the-job training protocol (adapted from Ricciardi, 2005) for using recommended practices to conduct ongoing observations. This protocol uses a competency-based approach to training (e.g., Reid & Parsons, 2002) and shares many of the same components used in initial training procedures (e.g., instruction, rehearsal, performance feedback). Thus, ongoing supervision diverges from initial training in process more than in content: the goal of ongoing supervision observations is to provide practice opportunities for staff to demonstrate high levels of treatment integrity in their actual workplace setting. The provision of supervision and support in this capacity also affords other benefits, such as the opportunity for supervisors and staff to troubleshoot implementation issues that may not have arisen in the controlled settings where initial training typically occurs. Relatedly, supervisors can embed coaching procedures (e.g., prompting, prompt fading, reinforcement) into their observations to facilitate staff implementation at mastery levels.

Table 1 On-the-job training protocol (adapted from Ricciardi, 2005)

Step 1: Verbal instruction
  □ Meet with the trainee.
  □ Verbally review the training checklist item-by-item, answer questions, and provide clarification if needed.

Step 2: Written instruction
  □ Provide the trainee with a copy of the training checklist.
  □ Verbally describe how the checklist is a resource tool.

Step 3: Trainee observations
  □ On-the-job demonstrations are preferred over role-play scenarios.
  □ Minimize reactivity during observation.
  □ Record data on trainee performance for each skill.

Step 4: Deliver feedback
  □ Review performance on each skill, providing positive and corrective feedback as needed.
  □ Provide praise for steps and skills implemented correctly.
  □ Provide corrective feedback for steps and skills implemented incorrectly, along with information on how to implement them correctly during future performance.
  □ Practice skills performed incorrectly via role play; model correct performance when needed.
  □ Ask the trainee if they have any questions and answer them.

Step 5: Repeat steps until mastery is achieved
  □ Observations should occur over time (e.g., across days, shifts).
  □ Conduct observation probes intermittently to assess performance.

Pay for Performance

An important consideration for maintaining desirable performance is the method by which staff are compensated for their work. Most compensation systems in the United States involve pay-for-time systems, in which compensation is primarily based on the amount of time spent at work, not performance of job duties. The prevalence of pay-for-time systems presents an interesting challenge for behavior-analyst supervisors, as the primary contingency (i.e., pay) is delivered largely independent of performance of job duties. Although pay-for-performance systems likely maintain higher quality and quantity of work than pay-for-time systems, several factors impede implementation of pay-for-performance systems.

The first barrier to performance-contingent pay is that many supervisors may not be in a position in their organization where they are able to dramatically change existing pay structures. Those supervisors who have control over organizational pay structures would also be likely to experience resistance from staff due to the ubiquity of traditional pay structures. A second barrier to pay-for-performance systems is a concern that these systems may place undue stress on employees—because their income is tied directly to their performance (Ganster, Kiersch, Marsh, & Bowen, 2011). This issue is made worse when incentive systems are dependent on factors beyond staff control.

Notwithstanding these barriers, supervisors must identify methods for reinforcing desirable performance within a traditional pay-for-time system. A large body of research on the use of incentives in organizational settings suggests that desirable staff performance can be maintained by monetary incentives that account for a relatively small proportion of total compensation for staff (e.g., Dickinson & Gillette, 1994). That is, desired performance is maintained by the presence of an incentive contingency, not the percentage of incentive pay (Poling, Dickinson, Austin, & Normand, 2000). Thus, one solution may be for supervisors to use monetary incentives in conjunction with traditional pay systems—readers in organizations capable of and interested in providing performance-based pay are directed to Abernathy (1996, 2014) for additional resources. Note that this type of system is not without limitations. First, monetary incentives must be arranged so they are sustainable over an extended duration, and many human-service settings simply do not have the funds to maintain these efforts. Second, monetary incentives may be subject to certain federal and state labor regulations (e.g., Fair Labor Standards Act of 1938), which may create unwanted logistical difficulties for an organization. Thus, despite the advantages of using a generalized conditioned reinforcer in the form of monetary incentives, supervisors may seek alternative forms of incentive delivery to maintain desirable staff performance.

Identifying Potential Incentives Although supervisors may feel confident in their ability to select effective nonmonetary incentives for their staff, we urge caution in doing so. Wilder, Rost, and McMahon (2007) asked supervisors to rank-order potential incentives based on how effective they thought each incentive would be for each of their employees. When comparing supervisor predictions with employee rankings, Wilder et al. observed high agreement for employees' most preferred items or activities; however, few supervisors accurately predicted moderate- and lower-preferred items or activities. These findings were replicated by Wilder, Harris, Casella, Wine, and Postma (2011) and suggest that supervisors generally do a poor job of predicting effective incentives.

Daniels and Bailey (2014) offer a systematic procedure to assist with selecting incentives. First, supervisors should consider what types of incentives they can reasonably provide. At this stage it is important to consider that incentives may be a mix of items, activities, or privileges. Whereas monetary and most tangible incentives have an inherent cost to an organization, many activities or privileges may come at little or no cost. For example, Iwata, Bailey, Brown, Foshee, and Alpern (1976) maintained high levels of daily care and training activities from staff working in a residential facility using an incentive where staff could rearrange their days off for the following week. Another important consideration at this stage is that supervisors should select items or activities that are sustainable for use for the indefinite future. Although reinforcement schedules may be thinned over time, the incentive program is only effective for as long as it is being implemented. Supervisors may need to have conversations with their organization’s management to identify incentives that the organization allows and can support—so the supervisor does not have to pay for it out-of-pocket.

Second, supervisors may wish to discuss these options for potential incentives with staff to identify a small number of items, activities, or privileges. This step should be conducted after first identifying potential items and activities because, otherwise, employees do not know what is available or may ask for things that cannot be delivered (see Daniels & Bailey, 2014).

Finally, after potential incentives have been identified, supervisors should conduct a systematic preference assessment with staff. Although there is a rich literature evaluating methods for assessing preference in clinical populations with limited verbal repertoires (e.g., Fisher et al., 1992), these methods are likely not appropriate for staff with strong verbal repertoires (see Waldvogel & Dixon, 2008). Two commonly used methods for assessing employee preference for incentives are a reinforcer survey (Daniels & Bailey, 2014) and a ranking procedure (Waldvogel & Dixon, 2008; Wine, Reis, & Hantula, 2014). The reinforcer survey asks employees to rate how much work they would be willing to do for each item or activity on a Likert-type scale from 0 (none at all) to 4 (very much). The ranking procedure is similar but provides a relative value for each potential incentive by asking employees to rank the items or activities from least to most preferred. Both the survey and the ranking procedure provide relatively accurate indications of effective incentives, although the survey method may be better at identifying a greater range of effective incentives (see Wine et al., 2014 for a comparison).
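To make the two assessment formats concrete, here is a minimal sketch of how survey ratings and least-to-most rankings might be tallied. The data, incentive names, and function names are hypothetical illustrations, not materials from the cited studies.

```python
def survey_summary(ratings):
    """ratings: {incentive: [0-4 Likert rating per employee]} -> mean rating."""
    return {item: sum(r) / len(r) for item, r in ratings.items()}

def rank_summary(rankings):
    """rankings: {employee: [incentives ordered least to most preferred]}
    -> mean rank position, where a higher mean means more preferred."""
    positions = {}
    for order in rankings.values():
        for position, item in enumerate(order, start=1):
            positions.setdefault(item, []).append(position)
    return {item: sum(p) / len(p) for item, p in positions.items()}

# Hypothetical responses from three employees (survey) and two (ranking)
ratings = {"extra break": [4, 3, 4], "gift card": [2, 4, 3], "parking spot": [1, 2, 0]}
rankings = {"A": ["parking spot", "gift card", "extra break"],
            "B": ["parking spot", "extra break", "gift card"]}

print(survey_summary(ratings))
print(rank_summary(rankings))
```

Note that the survey yields an absolute rating per incentive, whereas the ranking yields only relative standing, which matches the article's point that the survey may reveal a wider range of effective incentives.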

Incentive Delivery A final consideration for ensuring effective use of incentives is the way they are delivered. Some variables that warrant additional consideration for use with staff include incentive quality, probability of delivery, and delay to delivery. Incentive quality is largely determined by preference, although incentive magnitude and schedule also affect quality. Incentives may be delivered on a dense schedule initially and thinned once the employee has consistently demonstrated desired performance. Two considerations with respect to quality are that staff preferences may shift over time (Wine, Gilroy, & Hantula, 2012) and the use of incentives of varied preference may maintain high levels of performance (Wine & Wilder, 2009). Thus, supervisors should consider assessing preference on a regular basis and using a variety of moderate- to high-preferred incentives.

Although rules can help bridge delays to reinforcement, supervisors should seek to minimize delays as much as possible. In addition, with respect to the probability of incentive delivery, reinforcers may not need to be delivered each time. Though there is little research assessing the probability of incentive delivery in OBM settings, several studies have demonstrated the effectiveness of lottery systems at maintaining performance (e.g., Cook & Dixon, 2006; Iwata et al., 1976). Taken together, little OBM research has assessed the relative impact of reinforcer dimensions. However, recent research incorporating behavioral economics has provided a promising methodology through which a thorough understanding may be developed (e.g., Henley, DiGennaro Reed, Reed, & Kaplan, 2016; Wine et al., 2012).
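As an illustration of probabilistic delivery, the following hypothetical sketch implements a simple performance-based lottery in the spirit of Iwata et al. (1976): employees earn tickets for meeting a performance criterion, and one winner is drawn at random. The data and function name are our own illustrative assumptions.

```python
import random

def lottery_draw(entries, seed=None):
    """entries: {employee: number of tickets earned for meeting criterion}.
    Returns one winner, with odds proportional to tickets earned."""
    pool = [name for name, n in entries.items() for _ in range(n)]
    rng = random.Random(seed)  # seed allows a reproducible, auditable draw
    return rng.choice(pool)

# Hypothetical weekly entries: A met the criterion 3 times, B once, C twice
entries = {"A": 3, "B": 1, "C": 2}
winner = lottery_draw(entries, seed=42)
print(winner)
```

Because each incentive is delivered with some probability rather than every time, the per-employee cost stays low while performance contacts the contingency on every opportunity.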

Assessing Performance Problems

Despite best efforts to train and provide follow-up support to staff, supervisors will likely encounter instances of less-than-desirable performance. As is true with service delivery in clinical populations, performance management interventions typically provide benefit when they are informed by preintervention functional assessment. Though several informant assessments exist and have been implemented with success (e.g., Daniels & Lattal, 2017; Mager & Pipe, 1984), the Performance Diagnostic Checklist–Human Services (PDC–HS; Carr, Wilder, Majdalany, Mathisen, & Strain, 2013) has emerging support for use in human-service settings. The PDC–HS may be especially advantageous for BCBAs in supervisory positions who may have limited OBM experience, as it is less time-consuming and does not require the same level of expertise as behavioral systems analysis (for more information about behavioral systems analysis, see Johnson, Casella, McGee, & Lee, 2014; McGee & Diener, 2010; Sigurdsson & McGee, 2015). The PDC–HS is an informant assessment that may be used to identify variables causing or maintaining substandard performance. The assessment consists of 20 questions across four domains: (a) training; (b) task clarification and prompting; (c) resources, materials, and processes; and (d) performance consequences, effort, and competition. As its name suggests, the PDC–HS was developed specifically for use in human service settings, and its questions and domains target environmental variables that commonly affect performance in this particular setting.

The first section, training, contains questions about the type of training received and whether the employee has demonstrated desired levels of performance in the past. The second section, task clarification and prompting, includes questions about the employee’s knowledge of job requirements and the environment where the behavior is expected to occur. Third, the resources, materials, and processes section contains questions about the organization’s systems and resources that may be beyond the employee’s control. Finally, the performance consequences, effort, and competition section includes questions about overall supervisor support and supervision, feedback, and potential competing activities.

Carr et al. (2013) indicated that seven of the items on the PDC–HS can be answered through direct observation and recommended administering the remaining 13 items through discussion with the employee's supervisor. Practitioners might also consider administering the PDC–HS with multiple supervisors or across different levels of the organization (e.g., management, supervisors, employees), as there may not be 100% agreement across all relevant parties (e.g., Merritt, DiGennaro Reed, & Martinez, 2019).

When complete, results of the PDC–HS are scored by counting the number of items for which an area of concern was flagged in each domain (see Carr & Wilder, 2016; Carr et al., 2013); the domain with the most flagged items typically indicates the "function" of the performance problem, or the area to be targeted for intervention. The PDC–HS may be particularly useful because of its accompanying intervention planning guide, which provides a list of recommended interventions (with supporting literature) for each domain. For example, Bowe and Sellers (2018) conducted a PDC–HS to assess inaccurate implementation of error-correction procedures during teaching sessions, and results indicated that insufficient training was likely the greatest contributing factor to performance problems. The experimenters then compared an indicated intervention of BST with an intervention recommended in the task clarification and prompting domain (posting reminders; i.e., a nonindicated intervention) and found that performance improved only following the indicated intervention. In sum, the PDC–HS is particularly beneficial for supervisors in human service settings as it provides a systematic assessment of performance problems, is tailored specifically for use in human service settings, and helps supervisors select a function-indicated intervention.
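The scoring logic described above can be sketched as follows. The domain names follow Carr et al. (2013), but the per-domain item flags and the function name are hypothetical illustrations, not the published instrument.

```python
DOMAINS = ["training",
           "task clarification and prompting",
           "resources, materials, and processes",
           "performance consequences, effort, and competition"]

def score_pdc_hs(flags):
    """flags: {domain: [True for each item flagged as an area of concern]}.
    Returns per-domain flag counts and the domain with the most flags,
    i.e., the likely "function" of the performance problem."""
    counts = {d: sum(flags.get(d, [])) for d in DOMAINS}
    indicated = max(counts, key=counts.get)
    return counts, indicated

# Hypothetical completed assessment: three training items flagged
flags = {"training": [True, True, True],
         "task clarification and prompting": [False, True],
         "resources, materials, and processes": [False],
         "performance consequences, effort, and competition": [True]}
counts, indicated = score_pdc_hs(flags)
print(indicated)  # the domain to target with an indicated intervention
```

In this hypothetical case the training domain has the most flags, so the planning guide would point the supervisor toward training-based interventions such as BST, mirroring the Bowe and Sellers (2018) example.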

Continuing Education for Staff

As employees continue their professional growth at an organization, their interests may begin to extend beyond the scope of what a supervisor is able to provide. These interests may be viewed as an opportunity to develop skills by facilitating additional learning opportunities. Many of the approaches described in the section below on staying up to date with research can also be adapted for staff development. For example, a supervisor may encourage staff to attend regional conferences or workshops by providing time off work or assisting with registration costs. Those who oversee a large number of staff may consider inviting speakers to deliver workshops on topics related to the services the organization provides. By facilitating continuing education opportunities, supervisors create varied learning opportunities for staff while also creating natural rewards for staff who stay with an organization.

Staying Up to Date in Performance Management Research

Staying current with the research literature is an important skill for practitioners as it may foster the provision of high-quality supervision, training, and services across all levels of the organization. Section 1.01 of the Professional and Ethical Compliance Code includes a provision that states "behavior analysts rely on professionally derived knowledge based on science and behavior analysis when making scientific or professional judgements in human service provision, or when engaging in scholarly or professional endeavors" (BACB, 2014, p. 4). To meet this standard, credentialed behavior analysts must remain abreast of current scientific findings. At the broadest level, this often involves various classes of behavior that can be categorized into two areas: (a) attending professional development opportunities and (b) reading peer-reviewed publications in staff training and performance management.

Attending professional development opportunities often means attending local, regional, and/or international behavior-analytic conferences and workshops to learn about recent developments in a given area. With current technology, webinars, online-based trainings, and podcasts may also be options. For example, the OBM Network (http://obmnetwork.com) provides numerous webinars from excellent researchers in the field of OBM on topics such as feedback, remote supervision, pay-for-performance, leadership, and staff turnover, among others.

Carr and Briggs (2010) offer helpful recommendations for overcoming barriers to staying abreast of the scholarly literature. In addition, a means of remaining up to date on the latest publications is to use software- and internet-based tools that provide alerts for recently published literature in a journal or content area. For example, many journal providers offer table-of-contents alerts, which present the user with a brief snapshot of the articles published in the newest issue of a journal. Pubcrawler (http://pubcrawler.gen.tcd.ie/) is an internet-based alert service that allows users to create personal queries for recently published articles, which are then compiled into a list and sent via email. A notable feature is that users can refine a query using an extensive list of settings, including but not limited to commonly used components such as keywords and particular journals. For example, a user interested in receiving updates on recent staff training publications could create a query with "staff training" as the keyword and a list of behavior-analytic journals as the sources to search.

Given the expansive and ever-growing body of literature, it can be difficult to discern which literature is most relevant, particularly when many articles are accessible only through fees or university subscriptions. However, there are numerous methods to find articles that contain staff training or performance management content. One simple and straightforward way is to purchase access to behavior-analytic journals through the Association for Behavior Analysis International, which can often be done at a discounted rate for members. Credentialed behavior analysts also have access to select journals through the online portal of the BACB. Another method is to use Google Scholar (https://scholar.google.com), where articles may be freely available. An additional benefit of Google Scholar is that searches expand beyond the scope of peer-reviewed articles, as its database also indexes books, conference materials, and other non-peer-reviewed materials. ResearchGate (https://www.researchgate.net) is another tool for downloading articles or contacting researchers directly. Finally, although many articles can be accessed using the methods described above, authors will typically be happy to send articles to those who reach out to them directly via email (often found on their university web page) or ResearchGate.

Conclusion

Recent research suggests that recommended staff training and performance management practices are not being regularly implemented in organizations that hire behavior analysts, which could affect the quality of services being provided. This article attempted to bridge the research-to-practice gap by outlining research-supported initial training and ongoing staff support procedures within human service settings, presenting practice guidelines, and sharing information about easy-to-implement ways practitioners may stay abreast of current research. Assessment and intervention procedures based on the science of behavior analysis have empirical support, but we must ensure they are being implemented with high integrity by a well-trained and supported workforce.

References

Abernathy, W. B. (1996). The sin of wages. Atlanta, GA: Performance Management.

Abernathy, W. B. (2014). Beyond the Skinner box: The design and management of organization-wide performance systems. Journal of Organizational Behavior Management, 34, 235–254. https://doi.org/10.1080/01608061.2014.973631

Alvero, A. M., Bucklin, B. R., & Austin, J. (2001). An objective review of the effectiveness and essential characteristics of performance feedback in organizational settings. Journal of Organizational Behavior Management, 21(1), 3–29. https://doi.org/10.1300/J075v21n01_02

Balcazar, F. E., Hopkins, B. L., & Suarez, Y. (1985). A critical, objective review of performance feedback. Journal of Organizational Behavior Management, 7(3–4), 65–89. https://doi.org/10.1300/J075v07n03_05

Behavior Analyst Certification Board. (2014). Professional and ethical compliance code for behavior analysts. Littleton, CO: Author. Retrieved from https://www.bacb.com/ethics/ethics-code/. Accessed 19 Aug 2016.

Berkman, S. J., Roscoe, E. M., & Bourret, J. C. (2019). Comparing self-directed methods for training staff to create graphs using GraphPad Prism. Journal of Applied Behavior Analysis, 52, 188–204. https://doi.org/10.1002/jaba.522

Bowe, M., & Sellers, T. P. (2018). Evaluating the Performance Diagnostic Checklist–Human Services to assess incorrect error-correction procedures by preschool paraprofessionals. Journal of Applied Behavior Analysis, 51, 166–176. https://doi.org/10.1002/jaba.428

Brand, D., Henley, A. J., DiGennaro Reed, F. D., Gray, E., & Crabbs, B. (2019). A review of published studies involving parametric manipulations of treatment integrity. Journal of Behavioral Education, 28, 1–26. https://doi.org/10.1007/s10864-018-09311-8

Carr, J. E., & Briggs, A. M. (2010). Strategies for making regular contact with the scholarly literature. Behavior Analysis in Practice, 3, 13–18. https://doi.org/10.1007/BF03391760

Carr, J. E., & Wilder, D. A. (2016). The Performance Diagnostic Checklist–Human Services: A correction. Behavior Analysis in Practice, 9, 63. https://doi.org/10.1007/s40617-015-0099-3

Carr, J. E., Wilder, D. A., Majdalany, L., Mathisen, D., & Strain, L. A. (2013). An assessment-based solution to a human-service employee performance problem. Behavior Analysis in Practice, 6, 16–32. https://doi.org/10.1007/BF03391789

Cook, T., & Dixon, M. R. (2006). Performance feedback and probabilistic bonus contingencies among employees in a human service organization. Journal of Organizational Behavior Management, 25(3), 45–63. https://doi.org/10.1300/J075v25n03_04

Daniels, A. C., & Bailey, J. S. (2014). Performance management: Changing behavior that drives organizational performance (5th ed.). Atlanta, GA: Performance Management Publications.

Daniels, A. C., & Lattal, A. D. (2017). Life’s a PIC/NIC… when you understand behavior. Cornwall on Hudson, NY: Sloan Publishing.

Delli Bovi, G. M., Vladescu, J. C., DeBar, R. M., Carroll, R. A., & Sarokoff, R. A. (2017). Using video modeling with voice-over instruction to train public school staff to implement a preference assessment. Behavior Analysis in Practice, 10, 72–76. https://doi.org/10.1007/s40617-016-0135-y

Dickinson, A. M., & Gillette, K. L. (1994). A comparison of the effects of two individual monetary incentive systems on productivity: Piece rate pay versus base pay plus incentives. Journal of Organizational Behavior Management, 14(1), 3–82. https://doi.org/10.1300/J075v14n01_02

DiGennaro Reed, F. D. (2016, November). ABCs of staff support: Evidence-based performance management. Presentation for the Iowa Association for Behavior Analysis Annual Conference, Des Moines, IA.

DiGennaro Reed, F. D. (2017, May). Evidence-based performance management: Applying behavioral science to support practitioners. Presentation for the OBM in Health and Human Services Conference, Denver, CO.

DiGennaro Reed, F. D. (2018, May). Using behavioral science to support educators during consultation. Presentation for the annual meeting of the Association for Behavior Analysis International, San Diego, CA.

DiGennaro Reed, F. D. (2019, April). Performance management in ABA service settings. Presentation for the annual meeting of the Association for Professional Behavior Analysts, Atlanta, GA.

DiGennaro Reed, F. D., Blackman, A. L., Erath, T. G., Brand, D., & Novak, M. D. (2018). Guidelines for using behavioral skills training to provide teacher support. Teaching Exceptional Children, 50, 373–380. https://doi.org/10.1177/0040059918777241

DiGennaro Reed, F. D., Codding, R., Catania, C. N., & Maguire, H. (2010). Effects of video modeling on treatment integrity of behavioral interventions. Journal of Applied Behavior Analysis, 43, 291–295. https://doi.org/10.1901/jaba.2010.43-291

DiGennaro Reed, F. D., Erath, T. G., Brand, D., & Novak, M. D. (2019). Video modeling during coaching and performance feedback. In A. Fischer, E. Dart, T. Collins, & K. Radley (Eds.), Technology applications in school consultation, supervision, and school psychology training. New York, NY: Routledge.

DiGennaro Reed, F. D., & Henley, A. J. (2015). A survey of staff training and performance management practices: The good, the bad, and the ugly. Behavior Analysis in Practice, 8, 16–26. https://doi.org/10.1007/s40617-015-0044-5

Ducharme, J. M., & Feldman, M. A. (1992). Comparison of staff training strategies to promote generalized teaching skills. Journal of Applied Behavior Analysis, 25, 165–179. https://doi.org/10.1901/jaba.1992.25-165

Erbas, D., Tekin-Iftar, E., & Yucesoy, S. (2006). Teaching special education teachers how to conduct functional analysis in natural settings. Education and Training in Developmental Disabilities, 41, 28–36.

Fair Labor Standards Act of 1938. (n.d.). 29 U.S.C. §201 et seq.

Fisher, W., Piazza, C. C., Bowman, L. G., Hagopian, L. P., Owens, J. C., & Slevin, I. (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25, 491–498. https://doi.org/10.1901/jaba.1992.25-491

Freedman, D. H. (2016). Improving public perception of behavior analysis. The Behavior Analyst, 39, 89–95. https://doi.org/10.1007/s40614-015-0045-2

Ganster, D. C., Kiersch, C. E., Marsh, R. E., & Bowen, A. (2011). Performance-based rewards and work stress. Journal of Organizational Behavior Management, 31, 221–235. https://doi.org/10.1080/01608061.2011.619388

Giannakakos, A. R., Vladescu, J. C., Kisamore, A. N., & Reeve, S. A. (2016). Using video modeling with voiceover instruction plus feedback to train staff to implement direct teaching procedures. Behavior Analysis in Practice, 9, 126–134. https://doi.org/10.1007/s40617-015-0097-5

Goodman, J. I., Brady, M. P., Duffy, M. L., Scott, J., & Pollard, N. E. (2008). The effects of “bug-in-ear” supervision on special education teachers’ delivery of learn units. Focus on Autism and Other Developmental Disabilities, 23, 207–216. https://doi.org/10.1177/1088357608324713

Graff, R. B., & Karsten, A. M. (2012). Evaluation of a self-instruction package for conducting stimulus preference assessments. Journal of Applied Behavior Analysis, 45, 69–82. https://doi.org/10.1901/jaba.2012.45-69

Henley, A. J. (2018). Empirical foundations and practical applications of behavioral skills training. Invited presentation and workshop for Behavior Services of Western Massachusetts, Springfield, MA.

Henley, A. J. (2019). Supervising like a boss: Healthy behavioral practices for promoting effective supervision. Invited presentation for the Massachusetts Association of 766 Approved Private Schools, Marlborough, MA.

Henley, A. J., & DiGennaro Reed, F. D. (2015). Should you order the feedback sandwich? Efficacy of feedback sequence and timing. Journal of Organizational Behavior Management, 35, 321–335. https://doi.org/10.1080/01608061.2015.1093057

Henley, A. J., DiGennaro Reed, F. D., Reed, D. D., & Kaplan, B. A. (2016). A crowdsourced nickel-and-dime approach to analog OBM research: A behavioral economic framework for understanding workforce attrition. Journal of the Experimental Analysis of Behavior, 106, 134–144. https://doi.org/10.1002/jeab.220

Henley, A. J., Hirst, J. M., DiGennaro Reed, F. D., Becirevic, A., & Reed, D. D. (2017). Function-altering effects of rule phrasing in the modulation of instructional control. The Analysis of Verbal Behavior, 33, 24–40. https://doi.org/10.1007/s40616-016-0063-5

Higgins, S. T., Silverman, K., & Heil, S. H. (Eds.). (2007). Contingency management in substance abuse treatment. New York, NY: Guilford Press.

Iwata, B. A., Bailey, J. S., Brown, K. M., Foshee, T. J., & Alpern, M. (1976). A performance-based lottery to improve residential care and training by institutional staff. Journal of Applied Behavior Analysis, 9, 417–431. https://doi.org/10.1901/jaba.1976.9-417

Iwata, B. A., Wallace, M. D., Kahng, S., Lindberg, J. S., Roscoe, E. M., Conners, J., et al. (2000). Skill acquisition in the implementation of functional analysis methodology. Journal of Applied Behavior Analysis, 33, 181–194. https://doi.org/10.1901/jaba.2000.33-181

Jarmolowicz, D. P., Kahng, S., Ingvarsson, E. T., Goysovich, R., Heggemeyer, R., & Gregory, M. K. (2008). Effects of conversational versus technical language on treatment preference and integrity. Intellectual & Developmental Disabilities, 46, 190–199. https://doi.org/10.1352/2008.46:190-199

Jenkins, S. R., & DiGennaro Reed, F. D. (2016). A parametric analysis of rehearsal opportunities on procedural integrity. Journal of Organizational Behavior Management, 36, 255–281. https://doi.org/10.1080/01608061.2016.1236057

Johnson, D. A., Casella, S. E., McGee, H., & Lee, S. C. (2014). The use and validation of pre-intervention diagnostic tools in Organizational Behavior Management. Journal of Organizational Behavior Management, 34, 104–121. https://doi.org/10.1080/01608061.2014.914009

Kazemi, E., Shapiro, M., & Kavner, A. (2015). Predictors of intention to turnover in behavior technicians working with individuals with autism spectrum disorder. Research in Autism Spectrum Disorders, 17, 106–115. https://doi.org/10.1016/j.rasd.2015.06.012

Leon, Y., Wilder, D. A., Majdalany, L., Myers, K., & Saini, V. (2014). Errors of omission and commission during alternative reinforcement of compliance: The effects of varying levels of treatment integrity. Journal of Behavioral Education, 23, 19–33. https://doi.org/10.1007/s10864-013-9181-5

Lindgren, S., Wacker, D., Suess, A., Schieltz, K., Pelzel, K., Kopelman, T., et al. (2016). Telehealth and autism: Treating challenging behavior at lower cost. Pediatrics, 137, S167–S175. https://doi.org/10.1542/peds.2015-2851O

Lord, C., & McGee, J. P. (Eds.). (2001). Educating children with autism. Washington, DC: National Academy Press, Committee on Educational Interventions for Children with Autism, Division of Behavioral and Social Sciences and Education, National Research Council.

Mager, R., & Pipe, P. (1984). Analyzing performance problems (2nd ed.). Belmont, CA: Lake.

McGee, H. M., & Diener, L. H. (2010). Behavioral systems analysis in health and human services. Behavior Modification, 34, 415–442. https://doi.org/10.1177/0145445510383527

McGimsey, J. F., Greene, B. F., & Lutzker, J. R. (1995). Competence in aspects of behavioral treatment and consultation: Implications for service delivery and graduate training. Journal of Applied Behavior Analysis, 28, 301–315. https://doi.org/10.1901/jaba.1995.28-301

Merritt, T. A., DiGennaro Reed, F. D., & Martinez, C. E. (2019). Using the Performance Diagnostic Checklist–Human Services to identify an indicated intervention to decrease employee tardiness. Journal of Applied Behavior Analysis, 52, 1034–1048. https://doi.org/10.1002/jaba.643

Miltenberger, R. G. (2016). Behavior modification: Principles and procedures (6th ed.). Boston, MA: Cengage Learning.

Moore, J. W., & Fisher, W. W. (2007). The effects of videotape modeling on staff acquisition of functional analysis methodology. Journal of Applied Behavior Analysis, 40, 197–202. https://doi.org/10.1901/jaba.2007.24-06

National Autism Center. (2009). National Standards Project: Findings and conclusions. Randolph, MA: Author.

National Autism Center. (2015). National Standards Project, Phase 2: Findings and conclusions. Addressing the need for evidence-based practice guidelines for autism spectrum disorder. Randolph, MA: Author.

Nigro-Bruzzi, D., & Sturmey, P. (2010). The effects of behavioral skills training on mand training by staff and unprompted vocal mands by children. Journal of Applied Behavior Analysis, 43, 757–761. https://doi.org/10.1901/jaba.2010.43-757

Palmen, A., Didden, R., & Korzilius, H. (2010). Effectiveness of behavioral skills training on staff performance in a job training setting for high-functioning adolescence with autism spectrum disorder. Research in Autism Spectrum Disorders, 4, 731–740. https://doi.org/10.1016/j.rasd.2010.01.012

Parsons, M. B., Rollyson, J. H., & Reid, D. H. (2012). Evidence-based staff training: A guide for practitioners. Behavior Analysis in Practice, 5, 2–11. https://doi.org/10.1007/BF03391819

Peterson, L., Homer, A. L., & Wonderlich, S. A. (1982). The integrity of independent variables in behavior analysis. Journal of Applied Behavior Analysis, 15, 477–492. https://doi.org/10.1901/jaba.1982.15-477

Phaneuf, L., & McIntyre, L. L. (2007). Effects of individualized video feedback combined with group parent training on inappropriate maternal behavior. Journal of Applied Behavior Analysis, 40, 737–741. https://doi.org/10.1901/jaba.2007.737-741

Phillips, K. J., & Mudford, O. C. (2008). Functional analysis training for residential caregivers. Behavioral Interventions, 23, 1–12. https://doi.org/10.1002/bin.252

Piazza, C. C., Milnes, S. M., & Shalev, R. A. (2015). A behavior-analytic approach to the assessment and treatment of pediatric feeding disorders. In H. S. Roane, J. E. Ringdahl, & T. S. Falcomata (Eds.), Clinical and organizational applications of applied behavior analysis (pp. 69–94). Cambridge, MA: Academic Press. https://doi.org/10.1016/B978-0-12-420249-8.00004-6

Poling, A., Dickinson, A. M., Austin, J., & Normand, M. P. (2000). Basic behavioral research and organizational behavior management. In J. Austin & J. E. Carr (Eds.), Handbook of applied behavior analysis (pp. 295–320). Reno, NV: Context Press.

Prue, D. M., & Fairbank, J. A. (1981). Performance feedback in organizational behavior management: A review. Journal of Organizational Behavior Management, 3(1), 1–16. https://doi.org/10.1300/J075v03n01_01

Reid, D. H., & Parsons, M. B. (1996). A comparison of staff acceptability of immediate versus delayed verbal feedback in staff training. Journal of Organizational Behavior Management, 16(2), 35–47. https://doi.org/10.1300/J075v16n02_03

Reid, D. H., & Parsons, M. B. (2002). Working with staff to overcome challenging behavior among people who have severe disabilities: A guide for getting support plans carried out. Morganton, NC: Habilitative Management Consultants.

Reid, D. H., Parsons, M. B., & Green, C. W. (2011). Evidence-based ways to promote work quality and enjoyment among support staff: Trainee guide. Washington, DC: American Association on Intellectual and Developmental Disabilities.

Ricciardi, J. N. (2005). Achieving human service outcomes through competency-based training: A guide for managers. Behavior Modification, 29, 488–507. https://doi.org/10.1177/0145445504273281

Sarokoff, R. A., & Sturmey, P. (2004). The effects of behavioral skills training on staff implementation of discrete-trial teaching. Journal of Applied Behavior Analysis, 37, 535–538. https://doi.org/10.1901/jaba.2004.37-535

Shapiro, M., & Kazemi, E. (2017). A review of training strategies to teach individuals implementation of behavioral interventions. Journal of Organizational Behavior Management, 37, 32–62. https://doi.org/10.1080/01608061.2016.1267066

Shapiro, M., Kazemi, E., Pogosjana, M., Rios, D., & Mendoza, M. (2016). Preference assessment training via self-instruction: A replication and extension. Journal of Applied Behavior Analysis, 49, 794–808. https://doi.org/10.1002/jaba.339

Sigurdsson, S. O., & McGee, H. M. (2015). Organizational behavior management: Systems analysis. In H. S. Roane, J. E. Ringdahl, & T. S. Falcomata (Eds.), Clinical and organizational applications of applied behavior analysis (pp. 627–647). Cambridge, MA: Academic Press. https://doi.org/10.1016/B978-0-12-420249-8.00025-3

Slowiak, J. M., & Lakowske, A. M. (2017). The influence of feedback statement sequence and goals on task performance. Behavior Analysis: Research & Practice, 17, 357–380. https://doi.org/10.1037/bar0000084

Strock, M. (2007). Autism spectrum disorders (pervasive developmental disorders). Bethesda, MD: National Institute of Mental Health. Retrieved from https://eric.ed.gov/?id=ED495219. Accessed 19 Aug 2019.

Sulzer-Azaroff, B., & Gillat, A. (1990). Trends in behavior analysis in education. Journal of Applied Behavior Analysis, 23, 491–495. https://doi.org/10.1901/jaba.1990.23-491

Sundberg, D. B. (2016). Why people quit your company and how behavior analysis can slow the revolving door at ABA service providers. Retrieved from https://bsci21.org/why-people-quit-your-company-andhow-behavior-analysis-can-slow-the-revolving-door-at-aba-service-providers/. Accessed 2 Nov 2016.

U.S. Department of Health & Human Services. (1999). Mental health: A report of the Surgeon General. Rockville, MD: U.S. Department of Health & Human Services, Substance Abuse and Mental Health Services Administration, Center for Mental Health Services, National Institutes of Health, National Institute of Mental Health.

Van Houten, R., Nau, P., & Marini, Z. (1980). An analysis of public posting in reducing speeding behavior on an urban highway. Journal of Applied Behavior Analysis, 13, 383–395. https://doi.org/10.1901/jaba.1980.13-383

Waldvogel, J. M., & Dixon, M. R. (2008). Exploring the utility of preference assessments in organizational behavior management. Journal of Organizational Behavior Management, 28, 76–87. https://doi.org/10.1080/01608060802006831

Wallace, M. D., Doney, J. K., Mintz-Resudek, C. M., & Tarbox, R. S. (2004). Training educators to implement functional analyses. Journal of Applied Behavior Analysis, 37, 89–92. https://doi.org/10.1901/jaba.2004.37-89

Ward-Horner, J., & Sturmey, P. (2012). Component analysis of behavior skills training in functional analysis. Behavioral Interventions, 27, 75–92. https://doi.org/10.1002/bin.1339

Wilder, D. A., Harris, C., Casella, S., Wine, B., & Postma, N. (2011). Further evaluation of the accuracy of managerial prediction of employee preference. Journal of Organizational Behavior Management, 31, 130–139. https://doi.org/10.1080/01608061.2011.569202

Wilder, D. A., Rost, K., & McMahon, M. (2007). The accuracy of managerial prediction of employee preference: A brief report. Journal of Organizational Behavior Management, 27(2), 1–14. https://doi.org/10.1300/J075v27n02_01

Wine, B., Gilroy, S., & Hantula, D. A. (2012). Temporal (in)stability of employee preferences for rewards. Journal of Organizational Behavior Management, 32, 58–64. https://doi.org/10.1080/01608061.2012.646854

Wine, B., Reis, M., & Hantula, D. A. (2014). An evaluation of stimulus preference assessment methodology in organizational behavior management. Journal of Organizational Behavior Management, 34, 7–15. https://doi.org/10.1080/01608061.2013.873379

Wine, B., & Wilder, D. A. (2009). The effects of varied versus constant high-, medium-, and low-preference stimuli on performance. Journal of Applied Behavior Analysis, 42, 321–326. https://doi.org/10.1901/jaba.2009.42-321

Yeaton, W. H., & Bailey, J. S. (1978). Teaching pedestrian safety skills to young children: An analysis and one-year follow-up. Journal of Applied Behavior Analysis, 11, 315–329. https://doi.org/10.1901/jaba.1978.11-315

 



Effects of social interaction on leisure item preference and reinforcer efficacy for children with autism


Jan 5, 2026

 

Marissa E. Kamlowsky | Claudia L. Dozier | Stacha C. Leslie | Ky C. Kanaman | Sara C. Diaz de Villegas
Department of Applied Behavioral Science, University of Kansas, Lawrence, KS, USA

Correspondence
Claudia L. Dozier, Department of Applied Behavioral Science, University of Kansas, Lawrence, KS, 66045, USA.
Email: cdozier@ku.edu

Editor-in-Chief: John Borrero
Handling Editor: Craig Strohmeier

© 2024 Society for the Experimental Analysis of Behavior (SEAB).

Abstract

We replicated and extended Kanaman et al. (2022) by comparing outcomes of solitary (leisure items only), social (leisure items with social interaction), and combined (leisure items alone and leisure items with social interaction) stimulus preference assessments to determine the extent to which the inclusion of social interaction influenced the outcomes of preference assessments for five children with autism. We then conducted reinforcer assessments to determine the reinforcing efficacy of high- and low-preferred leisure items when presented with and without social interaction. The results showed that both high- and low-preferred items functioned as reinforcers to varying degrees for all participants and that the inclusion of social interaction increased the reinforcing efficacy of some items for all participants. Additionally, the results showed that combined preference assessments predicted reinforcer assessment outcomes for two of five participants but produced false-negative outcomes for three participants. Clinical implications and directions for future research are discussed.

Keywords: preference assessment, reinforcer assessment, reinforcer efficacy, social interaction

 

Stimulus preference assessments (SPAs) are commonly used to determine preferred items and activities that may be used as reinforcers to increase target behavior during clinical programming for individuals with autism and related disabilities (Hagopian et al., 2004). Numerous methods for conducting SPAs have been developed over the past several decades (see Saini et al., 2021, for a review), and most research has demonstrated that preference is a relatively reliable predictor of reinforcer efficacy (DeLeon et al., 2009; Piazza et al., 1996). The extent to which a stimulus functions as a reinforcer may be determined using a reinforcer assessment (Pace et al., 1985; Piazza et al., 1996), and researchers have found that high-preferred (HP) stimuli typically function as more effective reinforcers relative to low-preferred (LP) stimuli. That is, HP stimuli result in higher rates of responding, more persistent responding, or faster skill acquisition relative to LP stimuli (Penrod et al., 2008; Taravella et al., 2000). However, research has also shown that LP stimuli can function as reinforcers for some individuals or behaviors (N. M. Goldberg et al., 2023; Graff et al., 2006), particularly under single-operant reinforcer arrangements (Roscoe et al., 1999). Given the clinical utility of both HP and LP stimuli as reinforcers (e.g., increased flexibility in the use of reinforcers of different qualities, decreased likelihood of satiation; Paden & Kodak, 2015), empirical investigations of variables that influence preference and reinforcer efficacy are warranted.

Several variables may influence the outcomes of SPAs and reinforcer assessments. For example, motivating operations (i.e., conditions that momentarily alter the value of stimuli), such as the degree of deprivation or satiation associated with the stimuli included, may influence SPA outcomes (Hanley et al., 2006). Hanley et al. (2006) measured shifts in preference with 10 adults with developmental delays following motivating-operation manipulations by providing extended daily access to HP items (a preference-weakening manipulation) and pairing access to LP items with social and edible reinforcers (a preference-strengthening manipulation). The researchers found that shifts in preference could be imposed by directly manipulating the motivating operations; however, the overall outcomes aligned with research on preference stability suggesting that hierarchies of preference remain generally stable over time (Carr et al., 2000; Hanley et al., 2006; MacNaul et al., 2021).

Another variable that may influence outcomes is the magnitude or duration of stimulus access. For example, Paden and Kodak (2015) evaluated the effects of varying reinforcement magnitude on skill acquisition outcomes and preference for reinforcers with four children with autism. All participants preferred the large-magnitude reinforcers; however, both large- and small-magnitude reinforcers produced rapid skill mastery. In another study, Jones et al. (2014) compared the preference stability of 11 typically developing children when the duration of access to HP items varied from 30 s to 5 min. Preferences remained generally stable across durations; however, for some participants preference for certain leisure items (e.g., videos) was influenced by the duration of access.

In addition to magnitude and duration, the types of items (i.e., stimulus categories) that are used may also influence outcomes. For example, Conine and Vollmer (2018) compared relative preferences for edible and leisure items in combined SPAs (i.e., SPAs involving stimuli from different categories) for 26 children with autism; they found that edible items were ranked higher than leisure items for most participants. However, the researchers observed less displacement of items (i.e., changes in rankings as a result of inclusion or exclusion of a stimulus category) than was observed in previous research (e.g., DeLeon et al., 1997; Fahmie et al., 2015). Specifically, Conine and Vollmer found that multiple leisure items outranked edible items during combined SPAs. The researchers speculated that leisure items may have outranked edible items because screen-based media were included (e.g., tablets, computers). This finding suggests that the type of stimulus category (e.g., edible, leisure) influences outcomes and that the inclusion of screen-based media within leisure-item SPAs may also influence hierarchies.

Along with edible, leisure, and screen-based stimulus categories, researchers have begun to evaluate the effects of social stimuli (e.g., social interaction, leisure items or activities presented with social interaction) on SPA outcomes. For example, N. M. Goldberg et al. (2023) demonstrated that when social interaction was evaluated as a separate stimulus category in a combined SPA, displacement of preferences by stimulus category occurred for three of five participants. M. C. Goldberg et al. (2017) evaluated preference for and reinforcing efficacy of activities that were presented with and without social interaction provided by participants’ mothers for 21 boys with autism. Results showed that participants preferred certain activities presented with social interaction at levels similar to those of typically developing peers. However, interpretation of these results is limited in that participants’ mothers provided all social interaction and the extent to which individuals with autism would prefer social activities when provided by other adults (e.g., staff or therapists) is unknown.

More recently, Kanaman et al. (2022) evaluated the effects of social interaction provided by an experimenter (i.e., classroom teacher) during leisure-item presentation. Specifically, the researchers compared preference for and reinforcer efficacy of leisure items that were presented with and without social interaction for 33 typically developing children during solitary (leisure items alone), social (leisure items with social interaction), and combined (leisure items alone and leisure items with social interaction) paired-stimulus preference assessments (PSPAs; Fisher et al., 1992). The researchers then conducted a concurrent-operants reinforcer assessment to determine the extent to which HP leisure items identified from combined PSPAs predicted the results of reinforcer evaluations when presented with and without social interaction. The results of the solitary and social PSPAs indicated relatively stable preferences across participants; however, when leisure items were presented with and without social interaction in the same combined PSPA, the ranked order of the participants’ preferences shifted. Specifically, for most participants, preference for and reinforcing efficacy of leisure items increased when items were presented with social interaction, indicating that solitary and social stimuli are qualitatively different. However, there are some limitations worth noting.

First, Kanaman et al. (2022) conducted a concurrent-operants reinforcer assessment in which only the relative reinforcing efficacy of stimuli was assessed. In a concurrent-operants reinforcer assessment, participants may respond exclusively to a single task and the reinforcing effects of other stimuli (e.g., LP stimuli) may not be determined (Francisco et al., 2008). Alternatively, the use of a single-operant reinforcer assessment would allow for evaluation of the absolute reinforcing efficacy of each stimulus, which may provide information about the potency of LP stimuli (Francisco et al., 2008; Roscoe et al., 1999). Determining the reinforcing efficacy of LP items is important given that combined SPAs may be particularly susceptible to displacement effects, which may produce false-negative outcomes (DeLeon et al., 1997; N. M. Goldberg et al., 2023). That is, combined SPAs may suggest that specific stimuli or categories of stimuli are LP and less reinforcing. However, those LP stimuli may function as high-quality reinforcers, which would allow for increased flexibility and variety of stimuli used in programming (N. M. Goldberg et al., 2023). For example, after conducting combined PSPAs including edible, leisure, and social-interaction stimuli, N. M. Goldberg et al. (2023) conducted single-category reinforcer assessments and found that stimuli that were identified as LP produced levels of responding that were equal to those observed in the HP condition for two of five participants. Further, N. M. Goldberg et al. found that social interaction alone functioned as a reinforcer for four of five participants despite combined SPA outcomes indicating that social interaction was LP. These findings suggest the importance of evaluating LP stimuli that are identified from combined SPAs in subsequent reinforcer assessments. Second, Kanaman et al. used a relatively dense schedule of reinforcement (fixed-ratio [FR] 1 or 6), which did not reveal the extent to which HP stimuli would maintain reinforcing efficacy under progressively thinner schedules. Alternatively, using a progressive-ratio (PR) schedule of reinforcement in which the response requirement progressively increases within a single session may better highlight reinforcer efficacy (Francisco et al., 2008; Hodos, 1961; Roane, 2008; Roane et al., 2001).
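As a concrete illustration of the PR logic described here, the sketch below (our own example, not a procedure from any of the cited studies) computes a break point from a hypothetical total response count; the doubling step values are assumed for demonstration only.

```python
# Hypothetical sketch of a progressive-ratio (PR) schedule: the response
# requirement grows after each reinforcer delivery, and the break point is
# the last requirement the participant completes.

def pr_break_point(responses_emitted, requirements):
    """Return the last PR requirement completed given a total response count."""
    completed = 0
    remaining = responses_emitted
    for req in requirements:
        if remaining >= req:
            remaining -= req   # this ratio was completed; reinforcer delivered
            completed = req
        else:
            break              # requirement not met: session's break point reached
    return completed

# Example: requirements double each step (assumed values: 1, 2, 4, 8, 16).
reqs = [1, 2, 4, 8, 16]
print(pr_break_point(7, reqs))   # completes FR 1, 2, and 4 -> break point 4
```

A participant who emits 7 responses completes the FR 1, FR 2, and FR 4 requirements but not FR 8, so the terminal (break-point) value is 4.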

Although results of M. C. Goldberg et al. (2017), N. M. Goldberg et al. (2023), and Kanaman et al. (2022) provide information about the influence of social interaction on displacement of preferences and reinforcing efficacy of leisure items and activities, the generality of these findings is somewhat limited. Specifically, all participants included in Kanaman et al. were typically developing children and all social interaction provided by M. C. Goldberg et al. (2017) was delivered by the participants’ mothers. Thus, the extent to which similar findings would be obtained with individuals who commonly participate in SPAs (e.g., children with autism) and those who program reinforcer arrangements (e.g., therapists or teachers) remains unknown.

Furthermore, it is especially important to extend these findings to children with autism given that they may exhibit difficulties in social–emotional reciprocity (e.g., failure to initiate or respond to social interaction) and may show less preference for social stimuli relative to neurotypical peers (American Psychiatric Association, 2013). However, recent research evaluating the value of social and nonsocial activities for individuals with autism suggests that social interactions can be both preferred and reinforcing for this population (e.g., Call et al., 2013; N. M. Goldberg et al., 2023; Gutierrez et al., 2013; Kelly et al., 2014; Morris & Vollmer, 2019; Nuernberger et al., 2012). Despite these findings, it remains possible that clinical providers for children with autism will assume that social interaction is not preferred and may avoid providing interactions during reinforcement periods. Furthermore, understanding the extent to which including social interaction enhances a reinforcer’s value may lead to more efficient programming for skill acquisition and reduction of challenging behavior. For example, if social interaction enhances the reinforcing efficacy of specific leisure items, clinicians may strategically modify differential reinforcement procedures to include social interaction.

Taken together, these findings suggest that social interaction may be preferred and reinforcing for some individuals with autism. However, continued research on preference for and reinforcing efficacy of solitary and social stimuli for children with autism is warranted to more thoroughly program learning opportunities and arrange environments that are both individualized and preferred. Therefore, the purpose of the current study was to replicate previous research by comparing the influence of social interaction on preference for leisure items within solitary (leisure items presented without social interaction), social (leisure items presented with social interaction), and combined (duplicate leisure items presented with and without social interaction in the same assessment) SPAs with children with autism. Additionally, we evaluated the reinforcing efficacy of both HP and LP stimuli presented with and without social interaction in a subsequent reinforcer assessment using a single-operant arrangement with a PR schedule for most participants.

Method

Participants, setting, and materials

Five children diagnosed with autism participated in this study. All received their diagnoses from professionals who were not affiliated with the university-based child development center they attended, where all participants were enrolled in an early intervention program. Josie was a 5-year-old female who communicated vocally using full sentences and whose cultural identities included White and Afghan. Ophelia was a 3-year-old female who communicated vocally using one-to-two-word utterances and whose cultural identities included White and Nigerian. Andrew was a 5-year-old male who communicated vocally using one-to-three-word utterances and whose cultural identities included White and Hispanic. Masie was a 5-year-old female who communicated vocally using one-to-three-word utterances and whose cultural identities included White and Sri Lankan. Spencer was a 4-year-old White male who communicated vocally using one-to-three-word utterances.

Only children who met 80% accuracy in the preassessment (see below), a brief assessment to determine whether they could discriminate the social-interaction stimulus from other stimuli, were included; all recruited children met this criterion. Preference assessments for four children were conducted using a PSPA format; however, we conducted a free-operant preference assessment with response restriction (Hanley et al., 2003) for one participant (Spencer) due to the occurrence of challenging behavior when items were removed during previous attempts at the PSPA (see below for modifications to the free-operant assessment). A graduate student who was familiar to each participant (i.e., had known the participant for 6 months or more within the context of educational services, such as the supervisor in the participant’s classroom or an adjacent classroom) served as the experimenter for all of that participant’s SPA sessions. The experimenters conducted all SPAs at a desk or on the floor of a session room at the child development center that contained a table, two chairs, and relevant session materials. Preference-assessment sessions were conducted once per day for a maximum of 30 min, so as not to disrupt the participants’ early intervention programming, until all SPA trials were complete. This period ranged from 1 to 3 weeks across participants.

Materials for the solitary SPAs included four (Spencer only) or six different leisure items that were reported by caregivers and staff to be preferred for a participant (e.g., iPad, magnet letters, books, play food, dolls). Materials for the social SPAs included the same leisure items that were used in the solitary SPAs and two or four (Spencer only) pictures of the experimenter to be presented during the social SPA trials (i.e., one picture was presented with each leisure item that was presented with social interaction). The pictures were 30 × 21 cm in size and depicted the experimenter with a pleasant expression (e.g., smiling). Materials for the combined SPAs included duplicate sets of the leisure items that were used in the previous assessments and two pictures of the experimenter such that solitary and social leisure items could be included in the same assessment.

During the reinforcer assessment, session materials for Masie and Andrew included a binder with 20 laminated sheets of paper (22 × 28 cm). Each laminated paper depicted five squares (5 × 5 cm) with a single shape inside, which were used for a shape-matching task. A bin of laminated shape cutouts (5 × 5 cm) was used for the shape-matching task. Each cutout had a small piece of Velcro attached for ease of matching. A large number of laminated cutouts were included to ensure that participants had enough cutouts to match for the duration of the session without requiring the experimenter to reset the materials. For Josie and Ophelia, materials for the reinforcer assessment included the same bin of laminated shapes with corresponding bins for a sorting task. For most reinforcer assessments, we chose tasks that involved discrete responses and were similar to mastered educational tasks that were used in participants’ day-to-day programming. For Spencer, the reinforcer assessment involved separating the session room into three equidistant and concurrently available squares that were outlined on the floor with tape. Each square was approximately 1 × 1.5 m.

Additional session materials for all participants’ reinforcer assessments included the highest and lowest ranked leisure items from the combined SPA. That is, the leisure items (whether presented with or without social interaction) with the overall highest and lowest selection percentages from the combined SPAs were evaluated with and without social interaction in an HP and LP reinforcer assessment, respectively. If the HP leisure item from the combined SPA was the iPad, the experimenters evaluated both the iPad and the next highest ranked leisure item in separate reinforcer assessments. We chose to evaluate the reinforcing efficacy of both the iPad and the next-highest ranked leisure item in these circumstances based on previous research suggesting that screen-based media can influence preference hierarchies and potentially displace preference for other HP leisure items (Conine & Vollmer, 2018). Additionally, the reinforcer assessments included an alternative item or activity that was provided across all reinforcement sessions to ensure that the participant could engage in something other than the target response. The participant’s staff suggested the alternative item or activity as something that was continuously available in the participant’s classroom but not something hypothesized to be highly preferred (e.g., a puzzle or blocks always available in the free play area of the classroom). Finally, iPods were used to record sessions and for data collection.

Data collection and analysis

For all SPAs and reinforcer assessments, trained graduate and undergraduate observers collected data through a one-way observation booth or retroactively via videotaped sessions. Data were collected using paper data sheets and pencils or using iPods. For the four participants with whom we conducted PSPAs (all but Spencer), observers recorded participants’ leisure-item selection for each SPA trial. Leisure-item selection was defined as any instance in which the participant placed their hand on or pointed to a presented item within 5 s of its presentation. If the participant vocally selected (i.e., said the name of) an item, the experimenters prompted the participant to physically select the item by saying, “Point to the one you want.” Experimenters determined participants’ selection percentages for each item by summing the number of times an item was selected, dividing the result by the number of times that item was presented, and multiplying by 100. The experimenters then determined the rankings for the leisure items, which ranged from 1, indicating the item with the highest selection percentage, to 6 (solitary or social SPAs) or 12 (combined SPA), indicating the item with the lowest selection percentage. If two items produced equal selection percentages, the experimenters reviewed the raw data to determine which item was selected more frequently when paired with the other item and assigned the higher rank to the more frequently selected item.
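The selection-percentage and ranking computations described above can be sketched as follows; the item names and counts are hypothetical, and as noted, tied percentages would additionally be broken using the head-to-head selection count from trials pairing the two items.

```python
# Illustrative PSPA scoring (hypothetical data, not from the study).

def selection_percentage(times_selected, times_presented):
    """Selections divided by presentations, multiplied by 100."""
    return 100 * times_selected / times_presented

def rank_items(percentages):
    """Rank items 1..n by descending selection percentage.
    (With real data, tied items would be reordered using the
    head-to-head count from trials pairing the tied items.)"""
    ordered = sorted(percentages, key=lambda item: -percentages[item])
    return {item: rank for rank, item in enumerate(ordered, start=1)}

pcts = {"iPad": selection_percentage(9, 10),    # selected 9 of 10 times: 90%
        "blocks": selection_percentage(5, 10),  # 50%
        "book": selection_percentage(2, 10)}    # 20%
print(rank_items(pcts))  # {'iPad': 1, 'blocks': 2, 'book': 3}
```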

For Spencer, with whom we conducted a free-operant preference assessment with response restriction, the observers recorded interactions with leisure items by using iPods for each SPA session. Leisure-item interaction was defined as any instance in which Spencer’s hand contacted any part of the item for at least 1 s with an immediate onset and offset. Leisure-item interaction was recorded as duration in seconds, and the experimenters determined Spencer’s interaction percentage for each leisure item by dividing the total number of seconds that Spencer interacted with each item by the total number of seconds in the session and multiplying the quotient by 100. The experimenters then determined the rankings for leisure items, which ranged from 1, indicating the item with the highest interaction percentage, to 4 (solitary or social SPAs) or 8 (combined SPA), indicating the item with the lowest interaction percentage.

During reinforcer assessments, the data collectors scored correct and incorrect responding toward the target task. For Masie and Andrew, the target task was shape matching. Correct matches were defined as any instance in which the participant physically placed a shape cutout onto the corresponding shape in the binder. Incorrect matches were defined as any instance in which the participant matched a shape cutout to any shape other than the corresponding shape in the binder. For Josie and Ophelia, the target task was shape sorting. Correct sorts were defined as any instance in which the participant physically placed a shape cutout into the corresponding shape bin. Incorrect sorts were defined as any instance in which the participant placed a shape cutout into any bin other than the corresponding shape bin. Observers collected data on the frequency of correct matches or sorts, the frequency of incorrect matches or sorts, the duration of reinforcer access, and the terminal PR schedule (i.e., break point) in each session. Reinforcer access was defined as the time in seconds from when the experimenter provided access to the leisure item and social interaction (if applicable) to the moment access was removed. Experimenters determined the terminal PR schedule by identifying the last PR requirement that was successfully completed by the participant. For Spencer, the target task was in-square behavior, defined as any instance in which Spencer’s body was within one of the outlined areas of the session room. Observers recorded the duration of in-square behavior in seconds with an immediate onset and offset.

To address limitations that have been described in previous research, the experimenters also collected data on social consumption and leisure-item engagement during all reinforcement sessions. Social consumption was defined as any verbal interaction with the experimenter (e.g., initiating conversation, reciprocating conversation, manding for interaction, or gesturing), nonvocal interaction with the experimenter (e.g., reciprocating toy play, head nodding, or sharing toys), or orientation toward the experimenter (e.g., eye contact or participant’s face and body within approximately 1 m of the experimenter and face angled 90 degrees or less toward the experimenter). Leisure-item engagement was defined as any instance in which the participant made physical contact with the leisure item (e.g., spinning a spin toy, building with blocks, pushing a toy car). If engagement with a leisure item did not require physical manipulation (e.g., watching a video on an iPad), engagement was defined as looking at the leisure item. Social consumption and leisure-item engagement were both recorded as duration in seconds with an immediate onset and a 3-s offset.
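The 3-s offset rule for these duration measures can be illustrated with a short sketch: because a bout ends only after 3 s without the behavior, contact episodes separated by gaps shorter than 3 s fold into a single bout. The interval timestamps below are hypothetical.

```python
# Sketch of duration scoring under an immediate-onset, 3-s-offset rule.

def total_duration(episodes, offset=3):
    """episodes: list of (start, end) seconds, sorted by start time.
    Gaps shorter than `offset` count as continued engagement."""
    if not episodes:
        return 0
    merged = [list(episodes[0])]
    for start, end in episodes[1:]:
        if start - merged[-1][1] < offset:
            merged[-1][1] = max(merged[-1][1], end)  # continue current bout
        else:
            merged.append([start, end])              # new bout begins
    return sum(end - start for start, end in merged)

# Contact from 0-10 s, a 2-s pause, then 12-20 s: scored as one 20-s bout.
print(total_duration([(0, 10), (12, 20)]))  # 20
```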

Interobserver agreement

A second observer independently collected data on the participants’ item selection or interaction (Spencer only) for all SPAs. For the selection of leisure items, the observers calculated trial-by-trial interobserver agreement by dividing the number of trials with agreement (i.e., both observers recorded the same selection) by the total number of trials and multiplying the result by 100 to obtain a percentage. For interaction, the observers calculated agreement for total duration by dividing the smaller duration by the larger duration of interaction in seconds and multiplying the result by 100 to obtain a percentage. Mean agreement for selection was 100% for solitary, social, and combined SPAs for Masie, Josie, and Andrew. For Ophelia, mean agreement was 100% for the solitary and social SPAs and 97.73% (range: 95.45%–100%) for the combined SPA. Mean agreement for duration of Spencer’s interaction with leisure items was 98.1% (range: 84.1%–100%) for the solitary SPA, 99.6% (range: 97%–100%) for the social SPA, and 99.1% (range: 75%–100%) for the combined SPA.
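The two agreement formulas described above amount to the following; the observer records and durations are hypothetical.

```python
# Illustrative interobserver-agreement calculations (hypothetical data).

def trial_by_trial_agreement(obs1, obs2):
    """Percentage of trials on which both observers recorded the same selection."""
    agreements = sum(a == b for a, b in zip(obs1, obs2))
    return 100 * agreements / len(obs1)

def total_duration_agreement(dur1, dur2):
    """Smaller total duration divided by the larger, multiplied by 100."""
    return 100 * min(dur1, dur2) / max(dur1, dur2)

# Two observers agree on 3 of 4 selection trials.
print(trial_by_trial_agreement(["iPad", "book", "iPad", "blocks"],
                               ["iPad", "book", "blocks", "blocks"]))  # 75.0
# Observers record 84 s and 100 s of interaction.
print(total_duration_agreement(84, 100))  # 84.0
```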

A second, independent observer also collected data on correct matches or sorts during reinforcement sessions. For Andrew, Masie, Ophelia, and Josie, agreement was calculated using the proportional agreement method. That is, each 10-min session was separated into 10-s intervals, and the experimenters compared correct responses recorded across observers within each interval. The experimenters divided the smaller number of recorded responses by the larger number of recorded responses within each interval, summed the results, divided this result by the total number of intervals, and multiplied by 100 to obtain a percentage of agreement. For Andrew, interobserver agreement was calculated for a mean of 45.5% (range: 41.7%–50%) of sessions and averaged 99.1% (range: 96.9%–100%) for the iPad reinforcer assessment, 96.8% (range: 95.5%–99%) for the HP reinforcer assessment, and 96.1% (range: 93.7%–98.8%) for the LP reinforcer assessment. For Masie, agreement was calculated for a mean of 42.4% (range: 40%–55.6%) of sessions and averaged 91.5% (range: 86.8%–96.6%) for the iPad reinforcer assessment, 93.6% (range: 86.8%–98.8%) for the HP reinforcer assessment, and 97.6% (range: 96.5%–100%) for the LP reinforcer assessment. For Josie, agreement was calculated for a mean of 37.13% (range: 33.33%–40%) of sessions and averaged 95% (range: 89.2%–100%) for the iPad reinforcer assessment, 95.6% (range: 92.5%–98.4%) for the HP reinforcer assessment, and 98.1% (range: 95.6%–100%) for the LP reinforcer assessment. For Ophelia, agreement was calculated for a mean of 70.75% (range: 61.5%–80%) of sessions and averaged 98.9% (range: 96.7%–100%) for the HP reinforcer assessment and 99.3% (range: 97.9%–100%) for the LP reinforcer assessment. For Spencer, a second, independent observer also collected data on in-square behavior during reinforcement sessions. The experimenters calculated agreement for total duration of in-square behavior by dividing the smaller duration by the larger duration in seconds and multiplying the result by 100 to obtain a percentage. For Spencer, agreement was calculated for 54.9% (range: 33.33%–76.5%) of sessions and averaged 95.17% (range: 93%–100%) for the HP reinforcer assessment and 99.67% (range: 98.4%–100%) for the LP reinforcer assessment.
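The proportional agreement calculation can be sketched as follows. The per-interval counts are hypothetical, and treating intervals in which both observers recorded zero responses as full agreement is our assumption.

```python
# Sketch of interval-by-interval proportional agreement (hypothetical counts).

def proportional_agreement(counts1, counts2):
    """counts1/counts2: per-10-s-interval response counts from two observers."""
    ratios = []
    for a, b in zip(counts1, counts2):
        if a == b:                    # equal counts (including 0/0, assumed
            ratios.append(1.0)        # here to count as full agreement)
        else:
            ratios.append(min(a, b) / max(a, b))
    return 100 * sum(ratios) / len(ratios)

# Four intervals: agree, agree, 1 vs. 2 responses, agree -> mean ratio 0.875.
print(proportional_agreement([4, 0, 2, 5], [4, 0, 1, 5]))  # 87.5
```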

Design

The frequency of correct matches or sorts was evaluated across a baseline and a PR reinforcer assessment for Andrew, Masie, and Josie. For all participants except Spencer (see procedural modifications below), we conducted single-operant reinforcer assessments that included baseline, HP, and LP phases. For Andrew, Masie, and Josie, an iPad phase was also included prior to the HP and LP phases, and the PR schedule gradually increased within session (see procedures below). For Ophelia and Spencer, an FR schedule was used throughout the reinforcement evaluation. Specifically, for Ophelia, the frequency of correct sorts was evaluated across a baseline and the FR reinforcer assessment. We implemented an FR 1 schedule for Ophelia throughout the reinforcer assessment given that this schedule was more akin to the reinforcement schedules used during her clinical programming. For Spencer, the duration of in-square behavior was evaluated across an HP and LP concurrent-operants reinforcer assessment. We implemented a concurrent-operants reinforcer assessment for Spencer due to observed levels of challenging behavior when preferred items were unavailable.

Within each reinforcer assessment, conditions alternated in a multielement design across several evaluations. First, an iPad-solitary, iPad-social, and social-interaction-only condition rapidly alternated within the iPad reinforcer assessment (if applicable). Then, the HP solitary, HP social, and social-interaction-only conditions rapidly alternated within the HP reinforcer assessment. Finally, the LP solitary, LP social, and social-interaction-only conditions rapidly alternated within the LP reinforcer assessment. If similar responding was observed across reinforcement conditions, the experimenters reversed to baseline and replicated the phase.

Procedures

Preassessment

All participants completed a preassessment prior to SPAs to determine the extent to which they could discriminate between the presentation of solitary and social leisure items. The experimenter placed three picture cards equidistant on the floor or desk in front of the participant. One card depicted the experimenter with a pleasant expression to be used in the presentation of social leisure items. Another card depicted a common object (e.g., a book), and the third card was a blank control card. The experimenter instructed the participant to touch one of the picture cards by saying, “Touch the picture of me” for the social-interaction card or “Touch book” for the object card. After the participant selected a card (whether correct or incorrect), the experimenter removed all cards from the array, rotated the cards, and presented the next trial. The order of instructions given by the experimenter was determined via a random-number generator for each participant, and no programmed consequences were provided for correct or incorrect card selection. All participants selected the correct card for a minimum of 80% of trials across 15 trials and, therefore, continued to the SPAs.

Stimulus preference assessments

The experimenters conducted three separate SPAs for all participants to determine stimulus rankings for solitary leisure items (i.e., items presented alone) and social leisure items (i.e., items presented with social interaction) when presented in separate assessments (solitary and social SPAs) as well as when presented in the same assessment (combined SPAs). Consistent with previous research, the order of all participants’ assessments was solitary, social, then combined SPAs to reduce the likelihood of a previous programmed history of social interaction paired with leisure items influencing solitary SPA outcomes (Kanaman et al., 2022). As mentioned, a PSPA was conducted with four participants and a free-operant with response restriction preference assessment was conducted with Spencer (see procedural modifications below).

For all PSPAs, the experimenter labeled each stimulus and provided presession access (see specific procedures below). Immediately following presession access, the experimenter began the PSPA trials by presenting the first two stimuli equidistant on the desk or floor in front of the participant. The experimenter labeled each stimulus and allowed the participant to make a selection. Once the participant selected an item by touching or gesturing toward it, the experimenter removed the nonselected stimulus and provided 30 s of access to the selected stimulus (i.e., leisure item by itself or leisure item with continuous social interaction). After 30 s of access, the experimenter removed the selected stimulus (i.e., removed the item and discontinued social interaction, if applicable) and presented the next two stimuli. If the participant attempted to select two stimuli simultaneously, the experimenter blocked the selection by removing both stimuli from the array and re-presented the trial. If the participant did not select either stimulus within 5 s of presentation, the experimenter removed both stimuli and re-presented the trial. If the participant still did not select a stimulus, the experimenter removed both stimuli, recorded, “No selection,” and moved to the next pair of stimuli. No prompts or other feedback were provided throughout participants’ SPAs. This process was repeated until each stimulus was paired with every other stimulus twice to allow for each to be presented on either side of the participant.
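The pairing scheme described above (each stimulus paired with every other stimulus twice, once on each side of the participant) can be sketched as a trial generator. The function name, the shuffling of trial order, and the example stimulus names are illustrative assumptions:

```python
from itertools import permutations
import random

def pspa_trials(stimuli, seed=None):
    """Generate PSPA trials: each stimulus is paired with every other
    stimulus twice, once presented on the left and once on the right.
    Ordered pairs (left, right) accomplish this directly."""
    trials = list(permutations(stimuli, 2))  # n * (n - 1) ordered pairs
    random.Random(seed).shuffle(trials)      # randomize trial order (assumed)
    return trials

# Hypothetical four-item array
trials = pspa_trials(["book", "blocks", "puzzle", "train"], seed=1)
print(len(trials))  # 12 trials for 4 stimuli (4 * 3)
```

Each unordered pair appears exactly twice in the output, with left/right positions swapped between its two presentations.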

Solitary paired-stimulus preference assessments

Solitary PSPAs were conducted to determine the extent to which participants preferred leisure items when presented alone. Presession access included the experimenter presenting each leisure item alone and vocally labeling the item by saying, “This is the (leisure item name) to play by yourself” while pointing to the item. Then, the experimenter allowed the participant to interact with the item by themselves for 30 s. Immediately following presession access, the experimenter implemented the PSPA procedures as described above. If the participant attempted to interact with the experimenter at any point during the solitary PSPA, the experimenter indicated that social interaction was unavailable by stating, “I can’t talk right now.” If the participant made continued attempts to interact with the experimenter, the experimenter withheld attention and provided no other programmed consequence (i.e., the experimenter remained in their current position but avoided eye contact and any other interaction with the participant). The experimenter repeated the presentation and selection procedures outlined above for the duration of the solitary PSPA.

Social paired-stimulus preference assessments

Social PSPAs were conducted to determine the extent to which participants preferred leisure items when presented with social interaction. Similar to the procedure for the solitary PSPAs, the experimenter first provided presession access; however, each leisure item was presented with the picture of the experimenter. That is, the experimenter placed their picture and the leisure item in front of the participant, pointed to the picture and the leisure item, and said, “This is the (leisure item name) to play with me” before providing 30 s of access to the item with continuous social interaction. Social interaction in this condition consisted of conversation, comments, and play surrounding the presented item. For example, if the presented item was a book, the experimenter would read the book to the participant, describe the pictures, or have other conversation surrounding the story. The experimenter initially guided engagement and conversation with the leisure item but allowed the participant to lead if applicable (i.e., if the participant continued or initiated play or conversation surrounding the leisure item). All social interaction was contextual to the participant- and experimenter-led engagement with the leisure item that was presented. In addition, if the participant requested a different type of social interaction than was being provided (e.g., said, “Let’s play chase” when the book was presented), the experimenter redirected the participant to the available social interaction (e.g., the experimenter said, “We’re reading our book right now”). If the participant requested to end social interaction, the experimenter offered a different comment or interaction within the same context of play for that leisure item. Following presession access to each leisure item with social interaction, the experimenter implemented the presentation and selection PSPA procedures as described above for the duration of the social PSPA.

Combined paired-stimulus preference assessments

The combined PSPAs were conducted to determine the extent to which participants preferred leisure items when presented with social interaction (as done in social PSPAs) or without social interaction (as done in solitary PSPAs) within the same assessment. To allow a single leisure item to be presented in both a social and solitary manner, six leisure items plus duplicates of those same six items presented without social interaction were included. The experimenter first provided presession access by presenting each item (either with or without continuous social interaction, depending on the trial) in the same ways as described in the solitary and social PSPAs. The experimenter then presented the first two leisure items (either with or without pictures of the experimenter, depending upon the trial). Once the participant selected an item, the experimenter removed the non-selected item (and picture, if applicable) and provided 30 s of access to the selected item with or without continuous social interaction, depending on the selection. After 30 s of access, the experimenter removed the materials, discontinued social interaction (if applicable), and presented the next two items. The same procedures as were used in previous PSPAs were used in the combined PSPAs if participants attempted to select more than one item or did not select an item within 5 s of its presentation.

Free-operant with response restriction (Spencer only)

For Spencer, all SPAs were conducted in a free-operant-with-response-restriction format using procedures that were similar to those that were reported in Hanley et al. (2003). The experimenter first provided presession access in the same ways as described in the PSPA procedures. When beginning the solitary SPA, the experimenter presented the four leisure items by themselves equidistant in a semicircle on the floor in front of Spencer. The experimenter labeled each item and informed Spencer he could play with one, some, or none of the items and allowed him to interact with any item that he approached for the duration of the 5-min session without providing social interaction. For the social SPA, the procedures were identical to those for the solitary SPA, with the addition of a picture of the experimenter when presenting leisure items and the continuous delivery of social interaction (as described in PSPAs) with each item interaction. That is, during the social SPA, the experimenter placed their picture directly above each leisure item in the array; if Spencer interacted with an item, the experimenter provided continuous social interaction. For the combined SPA, the procedures were identical to those for the solitary and social SPAs, with the addition of duplicate items. That is, four leisure items were presented alone and four duplicate leisure items were presented with the picture of the experimenter to signal the availability of social interaction in the same array. To determine preferences, the experimenter recorded the item with which Spencer interacted for the longest duration at the end of each session and then removed that item from the array in the subsequent session. Throughout the assessment, the experimenter would have blocked any instance in which Spencer attempted to select multiple items from the array simultaneously (e.g., a social and solitary item); however, this never occurred. Following each session, the experimenter rotated the order of the remaining items and continued with the next session until the assessment was complete.

Reinforcer assessments

During the solitary conditions of the reinforcer assessment, reinforcement consisted of contingent access to the HP or LP leisure item that was identified from the combined SPA by itself (i.e., without social interaction). If the participant attempted to interact with the experimenter during a solitary condition, the experimenter stated, “I can’t talk right now.” During social conditions, reinforcement consisted of contingent access to the HP or LP leisure item from the combined SPA with continuous social interaction from the experimenter. During iPad, HP, and LP social conditions, social interaction consisted of comments, conversation, and play all relevant to the leisure activity or iPad (i.e., the same social interaction provided with social stimuli during SPAs). As in the SPAs, the experimenter initially guided then followed the participant’s lead with respect to conversation and play with the leisure items. During the social-interaction-only control condition, social interaction included general comments about the participant’s environment (i.e., neutral conversation) and a pleasant facial expression.

At the beginning of each reinforcer assessment session, the experimenter placed the target task materials and the alternative task materials equidistant on the table in front of the participant. The experimenter then provided a rule and presession exposure to the contingency in that condition using a three-step prompting sequence. During baseline sessions, no programmed consequences were delivered for correct responding. For Josie, Andrew, and Masie, correct responding was reinforced on a PR schedule. We doubled the PR schedule following the completion of two response requirements at a particular schedule (e.g., FR 1, FR 1, FR 2, FR 2, FR 4, FR 4, FR 8, FR 8) within the session to produce 1 min of reinforcement associated with that condition (Jarmolowicz & Lattal, 2010; Roane, 2008), and the PR schedule reset to FR 1 at the start of each new session (Harper et al., 2021). For Ophelia, correct responding was reinforced on an FR-1 schedule in which one correct response was required to produce 1 min of reinforcement associated with that condition throughout the session. All reinforcer-assessment sessions ended after 2 min without responding toward the target task or after 10 min elapsed, whichever came first. Additionally, the duration of reinforcer delivery was removed from the total session time to control for opportunities to respond across sessions. The order of all sessions for each participant was quasi-random. Specifically, prior to conducting sessions, the experimenter wrote the names of each condition on separate pieces of paper and placed all pieces of paper into a bowl. Then, the experimenter selected one piece of paper at a time and conducted sessions in the selected order for each phase of the reinforcer assessment.
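The within-session progressive-ratio progression described above (each ratio completed twice, then doubled, resetting to FR 1 each session) can be sketched as follows. The function name and parameters are illustrative, not from the original study:

```python
def pr_requirements(n: int, start: int = 1, repeats: int = 2):
    """Progressive-ratio sequence: each ratio repeats `repeats` times,
    then doubles (FR 1, FR 1, FR 2, FR 2, FR 4, FR 4, ...).
    Called fresh at the start of each session to model the reset to FR 1."""
    reqs, ratio = [], start
    while len(reqs) < n:
        reqs.extend([ratio] * repeats)  # two completions at this ratio
        ratio *= 2                      # then double the requirement
    return reqs[:n]

print(pr_requirements(8))  # [1, 1, 2, 2, 4, 4, 8, 8]
```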

Baseline

During baseline sessions, neither the preferred items nor the pictures of the experimenter were available. At the start of session, the experimenter provided presession exposure to the response requirement by first vocally informing the participant about the task and the consequence for that condition (e.g., “If you [match or sort], nothing happens”). Then, the experimenter prompted the participant to match or sort one shape and provided no programmed consequence. The participant was free to match (or sort) shapes or engage with the alternative task materials for the duration of the session.

iPad (solitary and social)

During the solitary and social conditions with the iPad, the iPad, target task materials, picture of the experimenter (social condition only), and alternative task materials were placed in front of the participant. The experimenter first vocally informed the participant about the task and the consequence for that condition (e.g., “If you match [or sort], you get to play the iPad by yourself”; “If you match [or sort], you get to play the iPad with me”). The experimenter then prompted the participant to match (or sort) one shape and delivered the iPad for 1 min with or without social interaction, depending on the condition. Then, the experimenter removed access to the iPad (and social interaction, if applicable). Following presession exposure, the experimenter started the session timer, withheld access to the iPad and social interaction, and allowed the participant to respond. Contingent on one correct match (or sort), the experimenter delivered 1 min of access to the iPad without social interaction (solitary) or with social interaction (social). After 1 min, the experimenter removed access to the iPad (and social interaction, if applicable) and allowed another opportunity to respond. For individuals under the PR schedule, the response requirement increased as previously described. For Ophelia, the response requirement remained at FR 1 for the duration of the session. Participants were free to match (or sort) shapes or engage with the alternative task materials for the duration of the session.

High preferred (solitary and social)

During the HP solitary and social conditions, the HP leisure item, target task materials, picture of the experimenter (social condition only), and alternative task materials were placed in front of the participant. The experimenter first vocally informed the participant of the task and the consequence for that condition (e.g., “If you match [or sort], you get to play [name of HP leisure item] by yourself”; “If you match [or sort], you get to play [name of HP leisure item] with me”). The remaining procedures were identical to those described in the iPad reinforcer assessment, except that the HP item was delivered in place of the iPad.

Low preferred (solitary and social)

During the LP solitary and social conditions, sessions were identical to those for the HP conditions described above except the LP leisure item was used.

Social-interaction-only control

During the social-interaction-only control condition, only the picture of the experimenter was placed in front of the participant. The experimenter first vocally informed the participant about the task and the consequence for that condition (e.g., “If you match [or sort], you get to talk to me”). The experimenter prompted the participant to match (or sort) one shape and provided 1 min of neutral social interaction. Next, the experimenter removed access to social interaction, started the session timer, and allowed the participant to respond. Contingent on one correct match (or sort), the experimenter delivered 1 min of continuous access to the same type of social interaction that was delivered in presession exposure (i.e., neutral conversation and general comments about the participant’s environment with a pleasant facial expression). After 1 min, the experimenter removed access to social interaction and allowed another opportunity to respond. The response requirement increased within session as described in previous conditions, and the participant was free to match (or sort) shapes or engage with the alternative task materials for the duration of the session.

Procedural modifications (Spencer)

As previously mentioned, Spencer’s reinforcer assessment was conducted using a concurrent-operants arrangement due to observed levels of Spencer’s challenging behavior when preferred items were unavailable. To conduct Spencer’s assessment, the session room was first separated into three concurrently available and condition-specific areas. Specifically, the room was separated into three large and equidistant squares (approximately 1 × 1.5 m) by placing tape on the floor to segment each area. The areas were arranged such that Spencer had enough space to walk between each square, stand within each square, and stand outside of the squares. Prior to each session, the experimenter arranged each square to depict solitary, social, and social-interaction-only conditions. That is, one square contained the HP or LP item (depending on whether it was an HP or LP condition) by itself (solitary), another square contained the HP or LP item and the picture of the experimenter (social), and the last square contained only the picture of the experimenter (social interaction only).

At the start of each session, the experimenter vocally informed Spencer about the condition and provided presession exposure to the stimulus in each square. That is, the experimenter prompted Spencer to step into each square and provided 30 s of access to the solitary, social, and social-interaction-only conditions. The experimenter informed Spencer that he could go back and forth between the squares or enter none of the squares. Contingent on Spencer entering one of the squares (i.e., once Spencer fully stepped into a square), the experimenter provided continuous access to the HP or LP item (with or without social interaction, depending upon the condition) or continuous access to neutral social interaction (social interaction only) for the duration that Spencer remained in the square. If Spencer entered none of the squares, the experimenter stood in the corner of the room and refrained from providing social interaction. The experimenter continued these procedures for the duration of the session. Following each completed session, the experimenter rotated the location of each stimulus within each square until the evaluation was complete.

Procedural fidelity

During the SPAs, secondary observers collected data on correct stimulus delivery (i.e., leisure item delivery with or without social interaction, depending on the programmed stimulus) during all SPA trials across participants. During PSPAs, observers scored stimulus delivery as correct if the experimenter correctly delivered the selected leisure item and social interaction (if applicable) on a trial. The observer scored the delivery of stimuli as incorrect if the experimenter failed to deliver a programmed stimulus for a selection or delivered the stimulus in a way that was different from what was programmed (e.g., delivered social interaction following selection of a solitary leisure item, delivered social interaction in a way that differed from the definition). For Spencer, the observer scored stimulus delivery as correct if the experimenter correctly provided access to the leisure item with which Spencer interacted and social interaction (if applicable) for the duration of Spencer’s interaction. The observer scored stimulus delivery as incorrect if the experimenter failed to provide access to the selected stimulus for the duration of Spencer’s interaction or provided access to a stimulus in a way that was different from what was programmed. Observers recorded correct or incorrect stimulus delivery following each selection or period of interaction, divided the number of opportunities with correct stimulus delivery by the total number of selections, and multiplied by 100 to obtain a percentage. The mean percentage of correct stimulus delivery was 100% for solitary, social, and combined SPAs for Masie, Josie, Ophelia, and Spencer. For Andrew, the mean percentage of correct stimulus delivery was 97% (range: 93.33%–100%) for the solitary SPA and 100% during social and combined SPAs.

During the reinforcer assessments, data that were collected on target responding and reinforcer access were used to calculate procedural fidelity across participants. Data collectors scored procedural fidelity as correct if the experimenter delivered the condition-specific stimulus (i.e., leisure item only, leisure item with social interaction, or social interaction only) in the correct way (i.e., presence or absence of leisure item with or without social interaction, depending upon the condition) within 3 s of completion of the current PR requirement (i.e., when the required number of correct matches or sorts under the current schedule was completed) or within 3 s of entering the square (Spencer only). Data collectors scored procedural fidelity as incorrect if the experimenter incorrectly delivered the condition-specific stimulus (e.g., delivered a leisure item with social interaction during a solitary condition) or delivered no stimulus within 3 s of completion of the response requirement. Data collectors also scored procedural fidelity as incorrect if the experimenter delivered a stimulus at any time other than at the completion of the response requirement (e.g., delivered the item after seven correct matches during the PR-8 schedule requirement). The experimenters divided the number of correct stimulus deliveries by the total number of opportunities and multiplied by 100 to obtain a percentage. For Andrew, procedural fidelity was assessed for a mean of 61.07% (range: 58.3%–66.6%) of sessions and averaged 97.6% (range: 83%–100%) for the iPad reinforcer assessment, 97.9% (range: 85.7%–100%) for the HP reinforcer assessment, and 100% for the LP reinforcer assessment. For Masie, procedural fidelity was assessed for a mean of 42.4% (range: 40%–55.6%) of sessions and averaged 83.8% (range: 70%–100%) for the iPad reinforcer assessment, 95.0% (range: 87.5%–100%) for the HP reinforcer assessment, and 97.1% (range: 85.7%–100%) for the LP reinforcer assessment. For Josie, procedural fidelity was assessed for a mean of 43.63% (range: 33.33%–50%) of sessions and averaged 94.2% (range: 70%–100%) for the iPad reinforcer assessment, 98.9% (range: 90%–100%) for the HP reinforcer assessment, and 98.2% (range: 90.9%–100%) for the LP reinforcer assessment. For Ophelia, procedural fidelity was assessed for a mean of 93.35% (range: 86.7%–100%) of sessions and averaged 96.9% (range: 85.7%–100%) for the HP reinforcer assessment and 99.2% (range: 93.33%–100%) for the LP reinforcer assessment. For Spencer, procedural fidelity was assessed for a mean of 83.35% (range: 66.7%–100%) of sessions and averaged 93.26% (range: 87%–100%) for the HP reinforcer assessment and 91.4% (range: 77%–100%) for the LP reinforcer assessment. Across participants, errors in procedural fidelity were observed during relatively high PR requirements (e.g., PR 16, PR 32) when the experimenter miscounted the participant’s matches or sorts and delivered the reinforcer prior to or soon after the response requirement was met (e.g., the experimenter delivered the reinforcer following 33 instead of 32 sorts).

Results

Figure 1 displays the results of the solitary, social, and combined SPAs for Josie, Ophelia, and Andrew given the similarities in their outcomes. Specifically, participants’ preference rankings remained relatively stable across assessments and the same two items were ranked “1” and “2” across solitary and social assessments for these participants. Additionally, for participants for whom the iPad was included (Josie and Andrew), the iPad was ranked “1” across all assessments. However, the combined SPA data showed some displacement of preference rankings across participants. For Josie, all items that were presented with social interaction were ranked higher than items that were presented without social interaction in the combined SPA. For Ophelia, almost all leisure items that were presented without social interaction were ranked higher than those that were presented with social interaction in the combined SPA. For Andrew, combined SPA outcomes were mixed; however, several leisure items that were presented without social interaction were ranked higher than items that were presented with social interaction.

Figure 2 displays the results of solitary, social, and combined SPAs for Masie and Spencer. Similar to the rankings of the other participants, the same two items were ranked “1” and “2” across solitary and social assessments and participants’ preference hierarchies remained relatively stable across all three assessments. For Masie, the iPad was ranked as “1” across all assessments, like other participants for whom the iPad was included. For both participants, the combined SPA data showed mixed preferences for solitary and social stimuli. That is, preference rankings of solitary and social stimuli depended on the leisure item.

Figure 3 displays the results of the reinforcer assessments for Andrew, Masie, and Ophelia given similarities in their outcomes during one or more social conditions.

FIGURE 1 Stimulus preference assessment results for Josie, Ophelia, and Andrew showing overall higher (Josie and Ophelia) or lower (Andrew) preference for a single stimulus category. SI = social interaction.

FIGURE 2 Stimulus preference assessment results for Masie and Spencer showing mixed preferences across stimulus categories. SI = social interaction.

FIGURE 3 Reinforcer assessment results for Andrew, Masie, and Ophelia showing differentiated responding in one or more social conditions. BL = baseline; SR = reinforcement; HP = high preferred; LP = low preferred; SI = social interaction; “a” indicates sessions that were terminated due to lack of responding.

Specifically, Andrew and Masie engaged in low levels of correct responding in baseline and higher levels of correct responding in the solitary and social test conditions relative to the social-interaction-only control condition during the iPad phase. Given that we observed overall higher levels of correct responding and differentiation between both test conditions relative to the control condition with limited variability, we moved to the HP-item phase. During this phase, Andrew and Masie engaged in higher levels of correct responding during the social test condition relative to the solitary test condition and differentiation was observed between test conditions and the social-interaction-only control condition. Finally, during the LP-item phase, Andrew continued to engage in high levels of correct responding in the social test condition relative to the solitary test condition and social-interaction-only control condition, with a high level of differentiation; however, Masie engaged in low levels of correct responding across all conditions, with minimal variability. For Masie, these results corresponded to her combined SPA. For Andrew, the combined SPA results were mixed; however, the reinforcer assessment results show that Andrew completed more responses when access to the HP and LP items was presented with social interaction. These results suggest that Andrew’s combined SPA produced at least a partial false-negative outcome (DeLeon et al., 1997; N. M. Goldberg et al., 2023).

Ophelia engaged in low levels of correct responding in baseline with minimal variability and higher levels of correct responding during both solitary and social test conditions relative to the social-interaction-only control condition in the HP-item phases. During the first LP-item phase, Ophelia engaged in much higher levels of correct responding during the social test condition relative to the solitary test condition and social-interaction-only control condition, suggesting that social interaction increased the reinforcing value of the LP item. These results suggest a false-negative outcome for Ophelia’s combined SPA given that almost all solitary items outranked social items. In Ophelia’s second HP-item phase, we observed a higher degree of variability; however, levels of responding across all conditions were similar to those that were observed during the first HP-item phase. During the second LP-item phase, Ophelia engaged in much higher levels of correct responding during both social and solitary test conditions relative to the social-interaction-only control condition; however, high levels of correct responding were observed in the final social-interaction-only control condition relative to previous control conditions. Overall, Ophelia’s responding during the social test condition in the second LP-item phase mirrored her responding during the first LP-item phase, whereas her responding during the solitary test condition was much higher in the second LP-item phase than it was in the first LP-item phase.

Figure 4 displays the results of the reinforcer assessment for Josie. Josie engaged in low levels of correct responding in baseline and high levels of correct responding across all conditions (including the social-interaction-only control condition) in the iPad and HP item phases, with slightly higher levels of correct responding in the social test conditions relative to solitary and social-interaction-only conditions. In the LP-item phase, Josie engaged in higher levels of correct responding in the social and solitary test conditions relative to the social-interaction-only control condition. These results partially correspond to Josie’s combined SPA, which showed that all stimuli that were presented with social interaction outranked those that were presented without social interaction. Overall, the results of Josie’s reinforcer assessment suggest that leisure items that were presented with and without social interaction functioned as effective reinforcers.

FIGURE 4 Reinforcer assessment results for Josie. BL = baseline; SR = reinforcement; HP = high preferred; LP = low preferred; SI = social interaction; “a” indicates sessions that were terminated due to lack of responding.

Figure 5 displays the results of the concurrent-operants reinforcer assessment for Spencer. During both the HP- and LP-item phases, Spencer allocated the majority of session time to either the solitary or social test conditions relative to the social-interaction-only control condition, with more time spent in the social test condition overall. Preferences for treatment conditions relative to control conditions are not uncommon (e.g., Dozier et al., 2007), and these results correspond to Spencer’s combined SPA outcomes for the HP item. However, Spencer’s combined SPA data indicated that the LP item was less preferred when presented with social interaction. Similar to Andrew and Ophelia, the results of Spencer’s reinforcer assessment suggest a false-negative outcome in that Spencer allocated more responding toward the LP social test condition relative to the LP solitary test condition despite a lower ranking of the LP social item in the combined SPA.

FIGURE 5 Reinforcer assessment results for Spencer. SR = reinforcement; HP = high preferred; LP = low preferred; SI = social interaction

Figures 6 and 7 display participants’ data for social consumption and leisure-item engagement, respectively. The data for social consumption show that participants generally consumed the social interaction that was provided for most of the session across all conditions. Additionally, some participants showed higher levels of social consumption in social test conditions relative to the social-interaction-only control condition (e.g., Andrew during the iPad, HP, and LP conditions; Masie during the HP social condition). The data for leisure-item engagement show varying levels of engagement across conditions and participants. For example, Andrew’s data show high levels of engagement in both solitary and social conditions during the iPad and HP phases; however, during the LP phase, Andrew showed much higher engagement during the social condition than during the solitary condition. Similar results are observed in Masie’s and Ophelia’s HP and LP phases. For the other participants (i.e., Josie, Spencer), high levels of leisure-item engagement were observed during the social and solitary conditions of all reinforcer assessment phases.

FIGURE 6 Social consumption. HP = high preferred; LP = low preferred; SI = social interaction. White and gray bars represent the mean of values. Black error bars represent the range of values.

FIGURE 7 Leisure-item engagement data. HP = high preferred; LP = low preferred. White and gray bars represent the mean of values. Black error bars represent the range of values.

Discussion

Overall, the results of the current study replicated and extended those of previous research by demonstrating the influence of social interaction on the preference for and reinforcing efficacy of leisure items for children with autism. For all participants, both HP and LP items that were identified from the combined SPA functioned as reinforcers, and social interaction increased the reinforcing efficacy of one or more leisure items. Interestingly, the combined SPA produced false-negative outcomes for three participants in that items that were identified as LP functioned as reinforcers, producing the same or higher levels of responding as HP items did when the LP items were presented with social interaction. This finding replicates the results of N. M. Goldberg et al. (2023), indicating that combined SPAs that include social interaction as a stimulus category may not fully capture the reinforcing efficacy of LP social stimuli.

These findings extend research on variables that influence preference for leisure items and reinforcing efficacy for children with autism in several ways. First, we found that the iPad functioned as a reinforcer regardless of the provision of social interaction for all participants for whom the iPad was included. This finding supports those of Conine and Vollmer (2018), suggesting that the inclusion of screen-based media influences preference hierarchies. Second, the HP item from the combined SPA functioned as an effective reinforcer for all participants, and the reinforcing value of the HP item increased when presented with social interaction for four participants. Interestingly, for Ophelia, social interaction increased the reinforcing value of the LP item only. This finding suggests that social interaction likely increases the value of some leisure items; however, certain items may remain more preferred and reinforcing when presented alone. Therefore, continued research is warranted to determine variables that affect the reinforcing function of social interaction within the context of leisure activities. Third, combined SPAs failed to accurately predict the reinforcing value of HP and LP items (presented with and without social interaction) for some participants. For example, almost all solitary stimuli outranked social stimuli in Ophelia’s combined SPA; however, Ophelia engaged in more correct responses to access the LP item with social interaction. Additionally, Ophelia’s responding to access the LP solitary item increased substantially from the first LP-item phase to the second LP-item phase. Anecdotally, the experimenters observed Ophelia playing with the LP item by herself in ways she had not during the first LP solitary phase (e.g., she began imitating the experimenter’s vocalizations and motions with the LP item).
Therefore, it is possible that Ophelia’s increased level of responding in the second LP solitary phase was a result of learning. This finding suggests that exposure to social interaction with LP items can function to condition preferences and increase the reinforcing value of items by teaching new ways to engage with items (Hanley et al., 2006). Fourth, the order of all participants’ SPAs was solitary, social, then combined to decrease the potential influence of social SPAs on subsequent solitary SPAs; however, researchers may consider changing that order in future investigations to evaluate the stability of preferences following exposure to leisure items with social interaction.

Although results from the current study extend the literature in several ways, there are limitations worth noting. First, we did not control for access to or restriction of HP items, LP items, or social interaction outside of participants’ research sessions due to clinical programming needs. It is possible that stimulus access outside of research sessions influenced motivation for reinforcers (Hanley et al., 2006; Vollmer & Iwata, 1991). Second, although we used target tasks that were similar to those that are used in participants’ clinical programming, the extent to which similar reinforcement effects would be obtained with other responses (e.g., more complex curriculum tasks, activities of daily living) is unknown. Thus, future research might involve evaluating the reinforcing efficacy of leisure items that are presented with and without social interaction within the context of skill acquisition (i.e., novel tasks) and mastered tasks. It is possible that reinforcing efficacy depends on not only the degree of preference and inclusion of social interaction but also the difficulty associated with the task and the presence or absence of a reinforcement history for the target response. Relatedly, the PR schedules that were used during Andrew, Masie, and Josie’s reinforcer assessments all started at PR 1 and increased (i.e., progressively doubled) within session; thus, the extent to which similar results would be obtained if we had started sessions with a less dense schedule of reinforcement (e.g., PR 10) or maintained a consistent schedule (e.g., FR 10) within sessions is unknown. In the future, researchers may evaluate the influence of social interaction on reinforcement effects using schedules of reinforcement that are commonly implemented in early intervention or classroom settings (e.g., completion of one worksheet to access reinforcement).
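The within-session progression described above (starting at PR 1 and progressively doubling the response requirement) can be sketched as a short function. This is purely an illustration of the schedule arithmetic; the function name and the number of requirements generated are our own choices, not procedural details from the study.

```python
def pr_requirements(start=1, n_requirements=6):
    """Generate the successive response requirements of a
    progressive-ratio (PR) schedule that begins at `start` and
    doubles after each reinforcer delivery within a session."""
    requirements = []
    requirement = start
    for _ in range(n_requirements):
        requirements.append(requirement)
        requirement *= 2  # progressively double within session
    return requirements

# First six ratio requirements of a PR 1 doubling schedule
print(pr_requirements())  # [1, 2, 4, 8, 16, 32]
```

By contrast, the alternative schedules mentioned above (e.g., FR 10) would hold the requirement constant at 10 responses per reinforcer for the entire session.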

Third, although the social interaction provided during social conditions was programmed to be item specific, this did not allow the researchers to evaluate preference or reinforcing efficacy of other types of social interaction. Anecdotally, some participants (Andrew, Ophelia, and Spencer) occasionally requested non-programmed forms of social interaction (e.g., tickles, bounces, chase). Given that the experimenters did not provide non-programmed interactions, the extent to which individually identified and preferred forms of social interaction would have influenced outcomes remains unknown. In the future, researchers may include individualized forms of social interaction as a point of comparison. Furthermore, all sessions were conducted by the same experimenter for each participant. Therefore, it is unknown whether social interaction provided by other individuals (e.g., peers, less familiar adults) would be equally preferred and reinforcing. Thus, future research might involve investigating other parameters of interactions (e.g., provider, duration, presence or absence of instructions and modeling, compliance with requests).

Overall, results indicate that preference for and reinforcing efficacy of leisure items are influenced by the inclusion of social interaction for children with autism and that the degree of influence may be idiosyncratic. That is, for some individuals, social interaction may increase preference for and reinforcing efficacy of many or all leisure items; for other individuals, social interaction may increase preference for and reinforcing efficacy of only certain leisure items. In the future, researchers may conduct quantitative analyses to further describe the degree of displacement from solitary and social SPAs to combined SPAs or conduct correspondence analyses between SPAs and reinforcer-assessment outcomes to facilitate a detailed interpretation of the results. Thus far, our findings suggest that clinicians who are programming leisure items and activities as reinforcers should consider (a) their presentation during SPAs (i.e., whether leisure items are presented with or without social interaction during the assessment) and (b) the presentation of leisure items as programmed reinforcers (i.e., whether the leisure items are presented with or without social interaction when delivered following a target behavior for increase). Furthermore, given that we observed higher levels of engagement with leisure items during social conditions relative to those observed during solitary conditions for several participants, clinicians should strongly consider providing social interaction during reinforcer-access periods to promote item engagement and potentially condition social interaction to be reinforcing. Additionally, presenting leisure items and activities with social interaction may enhance the social and ecological validity of reinforcement procedures given that interactions commonly occur in the natural environment. 
Furthermore, the incorporation of social interaction may be a useful procedural adaptation to differential reinforcement procedures that are used for skill acquisition or to decrease challenging behavior, particularly for individuals whose challenging behavior is maintained by access to attention or tangibles. It is likely worthwhile for clinicians to determine which items are more reinforcing when presented with and without social interaction to program reinforcement most effectively.

Acknowledgments

We would like to thank McKenna Reilly, Abigail Rains, and Nicole Dowell for their assistance with data collection.

Conflict of Interest Statement

The authors have no conflicts of interest to disclose regarding the current manuscript.

Data Availability Statement

Data are available from the corresponding author upon request.

Ethics Approval

This study received institutional review board approval and was conducted in accordance with established ethical guidelines for the treatment of human participants. Caregivers provided informed consent for all participants. Prior to all sessions, assent from the participant was required.

References

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). https://doi.org/10.1176/appi.books.9780890425596

Call, N. A., Shillingsburg, M. A., Bowen, C. N., Reavis, A. R., & Findley, A. J. (2013). Direct assessment of preferences for social interactions in children with autism. Journal of Applied Behavior Analysis, 46(4), 821–826. https://doi.org/10.1002/jaba.69

Carr, J. E., Nicolson, A. C., & Higbee, T. S. (2000). Evaluation of a brief multiple-stimulus preference assessment in a naturalistic context. Journal of Applied Behavior Analysis, 33(3), 353–357. https://doi.org/10.1901/jaba.2000.33-353

Conine, D. E., & Vollmer, T. R. (2018). Relative preferences for edible and leisure stimuli in children with autism. Journal of Applied Behavior Analysis, 52(2), 557–573. https://doi.org/10.1002/jaba.525

DeLeon, I. G., Frank, M. A., Gregory, M. K., & Allman, M. J. (2009). On the correspondence between preference assessment outcomes and progressive-ratio schedule assessments of stimulus value. Journal of Applied Behavior Analysis, 42(3), 729–733. https://doi.org/10.1901/jaba.2009.42-729

DeLeon, I. G., Iwata, B. A., & Roscoe, E. M. (1997). Displacement of leisure reinforcers by food during preference assessments. Journal of Applied Behavior Analysis, 30(3), 475–484. https://doi.org/10.1901/jaba.1997.30-475

Dozier, C. L., Vollmer, T. R., Borrero, J. C., Borrero, C. S., Rapp, J. T., Bourret, J., & Gutierrez, A. (2007). Assessment of preference for behavioral treatment versus baseline conditions. Behavioral Interventions, 22(3), 245–261. https://doi.org/10.1002/bin.241

Fahmie, T. A., Iwata, B. A., & Jann, K. E. (2015). Comparison of edible and leisure reinforcers. Journal of Applied Behavior Analysis, 48(2), 331–343. https://doi.org/10.1002/jaba.200

Fisher, W., Piazza, C. C., Bowman, L. G., Hagopian, L. P., Owens, J. C., & Slevin, I. (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25(2), 491–498. https://doi.org/10.1901/jaba.1992.25-491

Francisco, M. T., Borrero, J. C., & Sy, J. R. (2008). Evaluation of absolute and relative reinforcer value using progressive-ratio schedules. Journal of Applied Behavior Analysis, 41(2), 189–202. https://doi.org/10.1901/jaba.2008.41-189

Goldberg, M. C., Allman, M. J., Hagopian, L. P., Triggs, M. M., Frank-Crawford, M. A., Mostofsky, S. H., Denckla, M. B., & DeLeon, I. G. (2017). Examining the reinforcing value of stimuli within social and non-social contexts in children with and without high-functioning autism. Autism, 21(7), 881–885. https://doi.org/10.1177/1362361316655035

Goldberg, N. M., Roscoe, E. M., Newman, Z. A., & Sedano, A. J. (2023). Single- vs. combined-category preference assessments for edible, leisure, and social-interaction stimuli. Journal of Applied Behavior Analysis, 56(4), 787–803. https://doi.org/10.1002/jaba.1007

Graff, R. B., Gibson, L., & Galiatsatos, G. T. (2006). The impact of high- and low-preference stimuli on vocational and academic performances of youths with severe disabilities. Journal of Applied Behavior Analysis, 39(1), 131–135. https://doi.org/10.1901/jaba.2006.32-05

Gutierrez, A., Fischer, A. J., Hale, M. N., Durocher, J. S., & Alessandri, M. (2013). Differential response patterns to the control condition between two procedures to assess social reinforcers for children with autism. Behavioral Interventions, 28(4), 353–361. https://doi.org/10.1002/bin.1372

Hagopian, L. P., Long, E. S., & Rush, K. S. (2004). Preference assessment procedures for individuals with developmental disabilities. Behavior Modification, 28(5), 668–677. https://doi.org/10.1177/0145445503259836

Hanley, G. P., Iwata, B. A., Lindberg, J. S., & Conners, J. (2003). Response-restriction analysis: I. Assessment of activity preferences. Journal of Applied Behavior Analysis, 36(1), 47–58. https://doi.org/10.1901/jaba.2003.36-47

Hanley, G. P., Iwata, B. A., & Roscoe, E. M. (2006). Some determinants of changes in preference over time. Journal of Applied Behavior Analysis, 39(2), 189–202. https://doi.org/10.1901/jaba.2006.163-04

Harper, A. M., Dozier, C. L., Briggs, A. M., Diaz de Villegas, S., Ackerlund Brandt, J. A., & Jowett Hirst, E. S. (2021). Preference for and reinforcing efficacy of different types of attention in preschool children. Journal of Applied Behavior Analysis, 54(3), 882–902. https://doi.org/10.1002/jaba.814

Hodos, W. (1961). Progressive ratio as a measure of reward strength. Science, 134(3483), 943–944. https://doi.org/10.1126/science.134.3483.943

Jarmolowicz, D. P., & Lattal, K. A. (2010). On distinguishing progressively increasing response requirements for reinforcement. The Behavior Analyst, 33(1), 119–125. https://doi.org/10.1007/bf03392207

Jones, B. A., Dozier, C. L., & Neidert, P. L. (2014). An evaluation of the effects of access duration on preference assessment outcomes. Journal of Applied Behavior Analysis, 47(1), 209–213. https://doi.org/10.1002/jaba.100

Kanaman, N. A., Hubbs, A. L., Dozier, C. L., Jones, B. A., Foley, E., & Ackerlund Brandt, J. (2022). Evaluating the effects of social interaction on the results of preference assessments for leisure items. Journal of Applied Behavior Analysis, 55(2), 430–450. https://doi.org/10.1002/jaba.897

Kelly, M. A., Roscoe, E. M., Hanley, G. P., & Schlichenmeyer, K. (2014). Evaluation of assessment methods for identifying social reinforcers. Journal of Applied Behavior Analysis, 47(1), 113–135. https://doi.org/10.1002/jaba.107

MacNaul, H., Cividini-Motta, C., Wilson, S., & Di Paola, H. (2021). A systematic review of research on stability of preference assessment outcomes across repeated administrations. Behavioral Interventions, 36(4), 962–983. https://doi.org/10.1002/bin.1797

Morris, S. L., & Vollmer, T. R. (2019). Assessing preference for types of social interaction. Journal of Applied Behavior Analysis, 52(4), 1064–1075. https://doi.org/10.1002/jaba.597

Nuernberger, J. E., Smith, C. A., Czapar, K. N., & Klatt, K. P. (2012). Assessing preference for social interaction in children diagnosed with autism. Behavioral Interventions, 27(1), 33–44. https://doi.org/10.1002/bin.1336

Pace, G. M., Ivancic, M. T., Edwards, G. L., Iwata, B. A., & Page, T. J. (1985). Assessment of stimulus preference and reinforcer value with profoundly retarded individuals. Journal of Applied Behavior Analysis, 18(3), 249–255. https://doi.org/10.1901/jaba.1985.18-249

Paden, A. R., & Kodak, T. (2015). The effects of reinforcement magnitude on skill acquisition for children with autism. Journal of Applied Behavior Analysis, 48(4), 924–929. https://doi.org/10.1002/jaba.239

Penrod, B., Wallace, M. F., & Dyer, E. J. (2008). Assessing potency of high- and low-preference reinforcers with respect to response rate and response patterns. Journal of Applied Behavior Analysis, 41(2), 177–188. https://doi.org/10.1901/jaba.2008.41-177

Piazza, C. C., Fisher, W. W., Hagopian, L. P., Bowman, L. G., & Toole, L. (1996). Using a choice assessment to predict reinforcer effectiveness. Journal of Applied Behavior Analysis, 29(1), 1–9. https://doi.org/10.1901/jaba.1996.29-1

Roane, H. S., Lerman, D. C., & Vorndran, C. M. (2001). Assessing reinforcers under progressive schedule requirements. Journal of Applied Behavior Analysis, 34(2), 145–166. https://doi.org/10.1901/jaba.2001.34-145

Roane, H. S. (2008). On the applied use of progressive-ratio schedules of reinforcement. Journal of Applied Behavior Analysis, 41(2), 155–161. https://doi.org/10.1901/jaba.2008.41-155

Roscoe, E. M., Iwata, B. A., & Kahng, S. (1999). Relative versus absolute reinforcement effects: Implications for preference assessments. Journal of Applied Behavior Analysis, 32(4), 479–493. https://doi.org/10.1901/jaba.1999.32-479

Saini, V., Retzlaff, B., Roane, H. S., & Piazza, C. C. (2021). Identifying and enhancing the effectiveness of positive reinforcement. In W. W. Fisher, C. C. Piazza, & H. S. Roane (Eds.), Handbook of applied behavior analysis (2nd ed., pp. 175–192). The Guilford Press.

Taravella, C. C., Lerman, D. C., Contrucci, S. A., & Roane, H. S. (2000). Further evaluation of low-ranked items in stimulus-choice preference assessments. Journal of Applied Behavior Analysis, 33(1), 105–108. https://doi.org/10.1901/jaba.2000.33-105

Vollmer, T. R., & Iwata, B. A. (1991). Establishing operations and reinforcement effects. Journal of Applied Behavior Analysis, 24(2), 279–291. https://doi.org/10.1901/jaba.1991.24-279

 

Effects of stimulus presentation order during auditory–visual conditional discrimination training for children with autism spectrum disorder



 

Julie E. Cubicciotti, Jason C. Vladescu, and Kenneth F. Reeve
Caldwell University

Regina A. Carroll
University of Nebraska Medical Center’s Munroe-Meyer Institute

Lauren K. Schnell
Hunter College

Abstract

Children with autism spectrum disorder are typically taught conditional discriminations using a match-to-sample arrangement. Consideration should be given to the temporal order in which antecedent stimuli (the sample and comparison stimuli) are presented during match-to-sample trials, as various arrangements have been used in the extant literature. The purpose of the current study was to compare the effects of four stimulus presentation orders on the acquisition of auditory–visual conditional discriminations. The study included participants from a clinically relevant population (three children with autism spectrum disorder), employed clinically relevant teaching procedures, and included two presentation formats not included in previous comparison evaluations (simultaneous and sample-first with re-presentation conditions). Results were found to be learner-specific; that is, a different stimulus presentation format was most efficient for each participant. We provide suggestions to evaluate stimulus control topographies and enhance experimental control in match-to-sample arrangements.

Key words: autism spectrum disorder, conditional discriminations, discrete trial training, instructional efficiency, matching to sample, stimulus control

This article is based on a thesis submitted by the first author, under the supervision of the second author, at Caldwell University in partial fulfillment for the requirements of the Master of Arts in Applied Behavior Analysis.
Address correspondence to Jason C. Vladescu, Department of Applied Behavior Analysis, Caldwell University, 120 Bloomfield Avenue, Caldwell, NJ 07006. E-mail: jvladescu@caldwell.edu
doi: 10.1002/jaba.530
© 2018 Society for the Experimental Analysis of Behavior

 

When developing the procedural arrangement of match-to-sample (MTS) trials, consideration should be given to the temporal order in which antecedent stimuli (i.e., the comparison and sample stimuli) are presented. The sample stimulus could be presented prior to (i.e., sample-first arrangement) or following (i.e., comparison-first arrangement) the presentation of the comparison stimuli. Additionally, the sample and comparison stimuli could be presented simultaneously (i.e., simultaneous arrangement). The effects of temporal order of stimulus presentations during MTS are particularly applicable to consumers with autism spectrum disorder, considering the frequency with which conditional discriminations are established using MTS paradigms for this population. Consumers with autism spectrum disorder, as opposed to their peers of typical development, often require explicit teaching procedures to facilitate differential responding to environmental stimuli (Grow & LeBlanc, 2013). One difficulty that necessitates and complicates instruction is that faulty stimulus control (from the perspective of the teacher) may develop with consumers with autism spectrum disorder (i.e., weak stimulus control and/or inappropriate stimulus control, such as stimulus overselectivity, stimulus bias, or position bias; Pilgrim, 2015). At present, it is unclear how the temporal order of stimulus presentation may influence the development of stimulus control.

In the sample-first procedure, trials begin with the presentation of a sample stimulus, followed by the presentation of two or more comparison stimuli (e.g., Doughty & Saunders, 2009; Petursdottir & Aguilar, 2016). For a sample stimulus that is transient (e.g., the spoken word “lion”), Green (2001) recommends re-presenting the stimulus every 2 s until the individual responds to a comparison stimulus. This procedural variation was recommended to address the relatively brief window of time the sample stimulus is present, and to increase the likelihood that the individual has the opportunity to observe the sample stimulus. The sample-first procedure has a long history of use in the basic literature with both nonhuman and human participants (e.g., Cumming & Berryman, 1961; Saunders & Spradlin, 1989; Sidman & Tailby, 1982; Skinner, 1950) and has been used with some frequency in the applied literature (e.g., Carp, Peterson, Arkel, Petursdottir, & Ingvarsson, 2012; Groskreutz, Karsina, Miguel, & Groskreutz, 2010; Sprinkle & Miguel, 2012). Additionally, several researchers have explicitly recommended the use of sample-first arrangements when teaching conditional discriminations to consumers with autism spectrum disorder (e.g., Green, 2001).

In the comparison-first arrangement, trials begin with the presentation of two or more comparison stimuli, followed by the presentation of a sample stimulus. In the extant literature, applied researchers have used the comparison-first arrangement with some frequency (e.g., Delfs, Conine, Frampton, Shillingsburg, & Robinson, 2014; Dittlinger & Lerman, 2011; Fisher, Kodak, & Moore, 2007; Grannan & Rehfeldt, 2012; Grow, Carr, Kodak, Jostad, & Kisamore, 2011; Grow, Kodak, & Carr, 2014; Hanney & Tiger, 2012; Kodak et al., 2015; McGhan & Lerman, 2013). Further, several early intervention (EI) manuals for consumers with autism spectrum disorder describe the use of the comparison-first arrangement (Leaf & McEachin, 1999; Maurice, Green, & Luce, 1996; Sundberg & Partington, 1998), and this arrangement has been used to eliminate comparison-only control of responding (Carp et al., 2012; Doughty & Saunders, 2009; McIlvane, Kledaras, Stoddard, & Dube, 1990).

A third antecedent stimulus presentation format, the simultaneous presentation procedure, involves presenting the sample and comparison stimuli at the same time. Multiple applied studies have employed this procedure (e.g., Cividini-Motta & Ahearn, 2013; Fisher, Pawich, Dickes, Paden, & Toussaint, 2014; Hausman, Ingvarsson, & Kahng, 2014; Paden & Kodak, 2015; Slocum, Miller, & Tiger, 2012; Sy & Vollmer, 2012; Walker & Rehfeldt, 2012).

Although researchers have used the sample-first, comparison-first, and simultaneous procedures with success, the resulting studies do not establish the conditions under which one procedural variation may be more efficient than another. Comparison studies allow researchers to evaluate relative efficiency and may provide helpful information to practitioners as to the procedural arrangement that is most beneficial for the consumers they serve.

In this vein, Petursdottir and Aguilar (2016) investigated the effects of antecedent stimulus presentation order during a computer-presented MTS task by comparing the sample- and comparison-first methods for three children of typical development. In the sample-first condition, the experimenters required the participants to make a trial-initiation response, then the sample stimulus was presented, and then four comparison stimuli were presented. In the comparison-first condition, the experimenters required the participants to make a trial-initiation response, four comparison stimuli were presented, and then the sample stimulus was presented. In both conditions, correct responses produced a 4-s computer animation and sound clip. Incorrect responses produced a 4-s blackout, followed by the next trial. All participants demonstrated mastery-level responding faster in the sample-first condition, and these results were replicated for all participants.

The findings of Petursdottir and Aguilar (2016) suggest relative superiority of the sample-first procedure when teaching auditory– visual conditional discriminations. However, it is unclear whether these findings hold true for consumers with autism spectrum disorder. More specifically, Petursdottir and Aguilar used a computer to present trials, arranged differential reinforcement of correct responses from the onset of training, and omitted prompting and prompt-fading strategies. When conducting auditory–visual conditional discrimination training with consumers with autism spectrum disorder, instruction is likely to be delivered via tabletop procedures, use nondifferential reinforcement of prompted and unprompted responses (at least during the early stages of teaching; Vladescu & Kodak, 2010), and arrange prompts and prompt-fading strategies. Future research should evaluate the effects of variations in these procedural aspects on the relative efficiency of antecedent stimulus presentation formats. Additionally, Petursdottir and Aguilar did not include a condition to evaluate the stimulus order procedure recommended by Green (2001) when sample stimuli are transient (sample-first with re-presentation), or a condition that commonly appears in the applied literature (simultaneous presentation).

Therefore, the purpose of the present study was to evaluate the effects and relative efficiency of the sample-first, comparison-first, sample-first with re-presentation, and simultaneous presentation formats on the acquisition of auditory–visual conditional discriminations for three participants with autism spectrum disorder. We evaluated the relative efficiency of these conditions by collecting data on training sessions to mastery, training trials to mastery, and total training time. We included total training time as a dependent variable because it is possible that evaluating responding across different measurement scales (e.g., training sessions vs. training time) may yield different conclusions regarding the relative efficiency of training conditions (e.g., Black et al., 2016). Further, we used instructional components that are commonly used to teach consumers with autism spectrum disorder; that is, instruction was delivered via tabletop materials, prompting and prompt-fading strategies were used, and nondifferential reinforcement of unprompted and prompted correct responses were used during initial training sessions.

Method

Participants

Three children with autism spectrum disorder participated. A parent or teacher of each participant completed the Gilliam Autism Rating Scale-Third Edition (Gilliam, 2013) to document behaviors characteristic of autism spectrum disorder. Ratings for all three participants indicated a very likely probability of autism spectrum disorder. All three participants received intervention based on the principles of applied behavior analysis (ABA) in a suburban public-school classroom.

Zeek was an 8-year, 11-month-old male who had begun receiving services based on the principles of ABA at 20 months of age. He obtained standard scores of 62 (Qualitative Description: Extremely Low) and 45 (Extremely Low) on the Expressive Vocabulary Test-Second Edition (EVT-2; Williams, 2007) and the Peabody Picture Vocabulary Test-Fourth Edition (PPVT-4; Dunn & Dunn, 2007), respectively. Zeek scored into Level 3 of both the visual perceptual/match-to-sample and listener domains of the Verbal Behavior-Milestones Assessment and Placement Program (VB-MAPP; Sundberg, 2008) and scored 32 on the Barriers Assessment of the VB-MAPP.

Max was a 3-year, 11-month-old male who had been receiving ABA-based services for approximately 10 months. He obtained standard scores of 79 (Moderately Low) and 70 (Moderately Low) on the EVT-2 and the PPVT-4, respectively. Max scored into Level 2 on both visual perceptual/ match-to-sample and listener domains of the VB-MAPP, and he scored 32 on the Barriers Assessment of the VB-MAPP.

Adam was a 4-year, 3-month-old male who had been receiving ABA-based services for approximately 15 months. He obtained standard scores of 88 (Low Average) and 69 (Extremely Low) on the EVT-2 and the PPVT-4, respectively. Adam scored into Level 3 of the visual perceptual/match-to-sample domain and into Level 2 of the listener domain of the VB-MAPP, and he scored 22 on the Barriers Assessment of the VB-MAPP.

Setting and Materials

All sessions were conducted in a designated room in each participant’s home. The room contained a worktable, chairs, and the materials necessary for the sessions. Session materials included data sheets, pens, a digital timer, preferred stimuli, stimulus binders, and a video camera. The experimenter sat across from or next to the participant at the table during sessions. All sessions were recorded using a video camera. We created four stimulus binders (one for each condition) per participant to present trials. Each 2-in stimulus binder consisted of the following components: a sheet of colored paper (based on the results of a color preference assessment) attached to the cover of the binder, nine trial sheets (one for each trial in a session) consisting of a white piece of paper containing a horizontal array of three pictures, and a blank colored (specific to the condition) piece of paper atop each trial sheet. Comparison stimuli were either realistic colored pictures of animals or scaled black outlines of states filled in with black, approximately 5.08 cm x 5.08 cm in size. The blank piece of colored paper on top of the binder provided an opportunity for the participants to engage in a differential observing response prior to each session (participants were required to touch the paper and tact the corresponding color). A small colored square (specific to the condition) was placed on the table between the participant and the stimulus binder and provided an opportunity for participants to engage in a trial-initiation response prior to each trial to ensure they were oriented to the materials when the sample stimuli were presented (Saunders & Williams, 1998). A trial-initiation response may be particularly important when sample stimuli are transient. As opposed to an observing response, in which participants make a response to the sample, the trial-initiation response was emitted prior to the presentation of the sample (Green, 2001).

Content Validity

To gauge current practices in stimulus presentation in clinical work, we surveyed three behavior analysts and eight behavior technicians prior to beginning the study. These individuals had an average of 5 years (range, 3 years to 13.5 years) of experience working with individuals with autism spectrum disorder. Respondents viewed a PowerPoint presentation depicting the stimulus presentation formats and then completed a survey to report which presentation format they used in practice. Of those surveyed, five respondents reported using the sample-first procedure most frequently, five respondents reported using the comparison-first procedure most frequently, one respondent reported using the simultaneous procedure most frequently, and no respondents reported using the sample-first with re-presentation procedure most frequently.

Design, Dependent Variable, and Interobserver Agreement

During the treatment evaluation, acquisition of auditory–visual conditional discriminations in the sample-first, the comparison-first, the sample-first with re-presentation, and the simultaneous presentation conditions was compared using an adapted alternating-treatments design (Sindelar, Rosenberg, & Wilson, 1985) embedded within a nonconcurrent multiple-baseline-across-participants design. During each session, the experimenter recorded on a data sheet unprompted and prompted correct and incorrect responses, session duration, and comparison responses prior to the presentation of the sample stimulus during comparison-first trials. Unprompted correct responses were defined as the participant emitting the target response prior to the delivery of the prompt. An unprompted incorrect response was defined as the participant emitting a response other than the target response (i.e., error of commission) or no response (i.e., error of omission) prior to the delivery of a prompt. A prompted correct response was defined as the participant emitting the target response after the delivery of the prompt. A prompted incorrect response was defined as the participant emitting an error of commission or omission after the delivery of the prompt.

As in previous evaluations, the experimenter also measured responding to the comparison array prior to the delivery of the sample stimulus in the comparison-first condition (McIlvane et al., 1990; Petursdottir & Aguilar, 2016). We only scored responses to the comparison stimuli in this condition as unprompted or prompted correct or incorrect after the experimenter presented the sample stimulus.

To record session duration, a digital timer was started immediately before beginning the first trial of the session and stopped immediately following the completion of the last trial of the session. The relative efficiency of the four conditions was evaluated by comparing the total training sessions, total training trials, and total training time to mastery or until termination criteria were met. The total training sessions and trials were calculated by adding all of the sessions and trials until the mastery or termination criteria were met for each condition, respectively. Total training time was calculated by adding the cumulative session time required to reach mastery for all targets in each training condition.

Two secondary independent observers scored at least 33% of sessions in vivo or from video for each condition across phases for interobserver agreement (IOA) purposes. Trial-by-trial IOA was calculated by dividing the number of agreements by the number of agreements plus disagreements and converting to a percentage. An agreement was defined as both observers recording the same participant response during a trial and a disagreement was defined as the observers recording different participant responses during a trial. Mean IOA scores for Zeek, Max, and Adam were 99% (range, 83% to 100%), 99% (range, 83% to 100%), and 100%, respectively, across conditions. In addition, the secondary observer collected data on session duration for IOA purposes. Total duration IOA was calculated by dividing the smaller duration by the larger duration for each session and converting to a percentage. Mean duration IOA scores for Zeek, Max, and Adam were 94% (range, 70% to 100%), 95% (range, 83% to 100%), and 94% (range, 77% to 100%), respectively.
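The two agreement indices described above are simple proportions. As a minimal illustrative sketch (not the authors' code; all function and variable names are hypothetical), the trial-by-trial and total-duration IOA calculations can be expressed as:

```python
def trial_by_trial_ioa(obs1, obs2):
    """Percentage of trials on which two observers recorded the same response:
    agreements / (agreements + disagreements) x 100."""
    assert len(obs1) == len(obs2), "observers must score the same trials"
    agreements = sum(a == b for a, b in zip(obs1, obs2))
    return 100 * agreements / len(obs1)

def total_duration_ioa(d1, d2):
    """Smaller session duration divided by the larger, as a percentage."""
    return 100 * min(d1, d2) / max(d1, d2)

# A hypothetical 9-trial session scored by two observers
# ("U+" = unprompted correct, "U-" = unprompted incorrect, "P+" = prompted correct).
primary   = ["U+", "U+", "U-", "P+", "U+", "U+", "P+", "U+", "U+"]
secondary = ["U+", "U+", "U-", "P+", "U+", "U-", "P+", "U+", "U+"]
print(round(trial_by_trial_ioa(primary, secondary)))  # 8/9 agreements -> 89
print(round(total_duration_ioa(110, 120)))            # 110/120 -> 92
```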

Preference Assessments

Parents of each participant completed an experimenter-created survey to identify putative edible reinforcers. The experimenter conducted a paired-stimulus assessment (Fisher et al., 1992) using the top 10 edibles from the survey prior to the beginning of the evaluation. Prior to each session, the experimenter conducted a brief multiple-stimulus-without-replacement assessment (Carr, Nicolson, & Higbee, 2000) using the top five edibles identified from the paired-stimulus preference assessment, in an attempt to control for shifts in preference. The first three items selected were used as the putative reinforcers for the subsequent session. The experimenter also conducted a paired-stimulus color preference assessment (Heal, Hanley, & Layer, 2009) using colored pieces of paper and items to determine participant preference for 10 colors. Four colors that were approached during an approximately equal percentage of trials (to reduce any bias towards one color) were assigned as condition-correlated stimuli.

Target Identification and Assignment

To identify targets for the treatment comparison, we first assembled a pool of potential targets based on each participant’s individual educational goals. The experimenter conducted four pretest trials for each potential target in a random order without replacement (every potential target was presented once, in random order, before it was presented again). One trial was conducted in each presentation format (i.e., sample-first, comparison-first, sample-first with re-presentation, and simultaneous presentation). If a participant engaged in more than one unprompted correct response for a potential target during pretest trials, it was discarded. The experimenter assigned three targets to each of the four conditions (see Table 1) using a logical analysis (Wolery, Gast, & Ledford, 2014). The logical analysis considered the following dimensions: number of syllables in each target name, redundancy of phonemes across target names, and physical similarity (e.g., orientation, color, size, shape) across comparison stimuli.

General Procedure

Each target was presented three times during a session (i.e., 3 targets x 3 trials = 9 trials per session). At least one session per experimental condition was conducted per day, 1 to 5 days per week, with a minimum of 5 min between each session. We conducted sessions for each condition in random order without replacement. For Max and Adam, training continued until the participant demonstrated 100% unprompted correct responding across two consecutive sessions. For Zeek, training continued until the participant demonstrated 89% unprompted correct responding across two consecutive sessions. Mastery criteria were selected to match the criteria arranged in each participant’s educational setting. Training in the remaining conditions continued for a minimum of three additional sessions and until total training time was at least 20% greater than that of the condition mastered first. Once these criteria were met, training was discontinued in a condition as long as there was no apparent increasing trend in unprompted correct responding.
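The scheduling and mastery rules above can be sketched in a few lines. This is an illustrative reconstruction under our own naming, not the authors' implementation: each day the four conditions are run once each in a shuffled order (random order without replacement), and mastery is reached when two consecutive sessions meet the participant-specific criterion.

```python
import random

CONDITIONS = ["sample_first", "comparison_first",
              "sample_first_re_presentation", "simultaneous"]

def daily_session_order(conditions=CONDITIONS):
    """Random order without replacement: every condition is run once
    before any condition repeats."""
    order = list(conditions)
    random.shuffle(order)
    return order

def mastered(session_scores, criterion=100, window=2):
    """True once the last `window` consecutive sessions meet the criterion
    (100% unprompted correct for Max and Adam; 89% for Zeek)."""
    return (len(session_scores) >= window
            and all(s >= criterion for s in session_scores[-window:]))

print(mastered([67, 89, 100, 100]))      # True
print(mastered([89, 89], criterion=89))  # True (Zeek's criterion)
```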

Table 1. Target Sets

Condition Zeek Max Adam
Sample first New Hampshire Chinchilla Porcupine
Indiana Owl Leopard
Oklahoma Skunk Seal
Comparison first New Jersey Guinea pig Chinchilla
Alabama Leopard Owl
Mississippi Seal Skunk
Sample first with re-presentation Idaho Scorpion Scorpion
Arizona Orca Weasel
North Dakota Wolf Boar
Simultaneous Wisconsin Platypus Guinea pig
Connecticut Ostrich Pigeon
New Mexico Bat Wolf

A constant prompt delay procedure was used in all conditions. A 0-s prompt delay was implemented during the initial training sessions across conditions. During 0-s prompt delay trials, the experimenter provided an immediate model prompt (i.e., the experimenter touched the correct comparison stimulus with her finger) following the presentation of antecedent stimuli. During 0-s prompt delay trials, prompted correct responses resulted in the delivery of praise and an edible. If the participant engaged in a prompted incorrect response, the experimenter removed materials and presented the next trial. We continued to present 0-s prompt delay trials until the participant demonstrated 100% correct prompted responding for two consecutive sessions. Then, the experimenter increased the prompt delay to 5 s. During these trials, if the participant engaged in an unprompted correct response, the experimenter delivered an edible and praise. Following unprompted incorrect responses, the experimenter re-presented the sample stimulus and modeled the correct response and allowed the participant 5 s to respond. If the participant engaged in a prompted incorrect response following the model, the experimenter presented the next trial. If the participant engaged in a prompted correct response, the experimenter provided an edible and praise and then presented the next trial. We delivered only praise following prompted correct responses once the participant demonstrated unprompted correct responding during at least 50% of trials. Prior to the beginning of each trial, participants engaged in a trial-initiation response by touching the condition-correlated colored square placed in front of the stimulus binder (we conducted trial-initiation response training prior to the evaluation; contact second author for details). Once participants engaged in the trial-initiation response, the experimenter provided the antecedent stimuli based on condition-specific procedures.
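The consequence arrangement described above follows a small decision tree. The sketch below is a hypothetical rendering of that logic (our own names and simplifications, not the authors' procedure verbatim): at the 0-s delay only prompted responses can occur; at the 5-s delay, unprompted correct responses always produce an edible and praise, while prompted correct responses following the error-correction model produce only praise once differential reinforcement is in effect.

```python
def run_trial(delay_s, response_before_prompt, response_after_prompt,
              differential_reinforcement):
    """Return the consequence delivered on one training trial.

    delay_s is 0 during initial sessions, then 5 once prompted responding
    is 100% correct for two consecutive sessions. differential_reinforcement
    becomes True once unprompted correct responding reaches 50% of trials.
    """
    if delay_s == 0:
        # Immediate model prompt: only prompted responses can occur.
        return "praise+edible" if response_after_prompt == "correct" else "next trial"
    if response_before_prompt == "correct":  # unprompted correct
        return "praise+edible"
    # Unprompted incorrect: sample re-presented, response modeled, 5 s to respond.
    if response_after_prompt == "correct":
        # Edible withheld for prompted corrects under differential reinforcement.
        return "praise" if differential_reinforcement else "praise+edible"
    return "next trial"

print(run_trial(0, None, "correct", False))        # praise+edible
print(run_trial(5, "incorrect", "correct", True))  # praise
```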

Baseline. During baseline, the experimenter presented antecedent stimuli (the sample and comparison stimuli) according to condition-specific procedures (see below) and allowed the participant 5 s to respond. Following unprompted correct and incorrect responses, the experimenter provided a brief verbal statement (e.g., “okay”), then presented the next trial. The experimenter delivered an edible and praise for appropriate collateral behavior (e.g., sitting appropriately at the table) approximately every other trial during the intertrial interval in an attempt to maintain participant responding.

Sample first. For each trial, the participant engaged in the trial-initiation response and the experimenter presented the sample stimulus (e.g., “pigeon”). Immediately after the offset of the sample stimulus, the experimenter removed the blank piece of colored paper to reveal the trial sheet containing the three comparison stimuli.

Sample first with re-presentation. All procedures were identical to those in the sample-first condition, except that the experimenter presented the sample stimulus a second time (e.g., “pigeon”) immediately following the removal of the piece of paper covering the comparison stimuli.

Comparison first. The participant engaged in the trial-initiation response, the experimenter removed the blank piece of colored paper to reveal the trial sheet, waited 3 s, and then presented the auditory sample stimulus (e.g., “pigeon”).

Simultaneous presentation. The participant engaged in the trial-initiation response, then the experimenter removed the blank piece of colored paper to reveal the trial sheet and simultaneously presented the auditory sample stimulus (e.g., “pigeon”).

Procedural Integrity and Procedural Integrity IOA

An independent observer scored the integrity with which the experimenter implemented the condition-specific teaching components (i.e., presented stimulus binder, prompted trial initiation response, presented antecedent stimuli in correct sequence, implemented correct prompt delay, provided correct consequence for correct and incorrect responses, recorded data, and provided appropriate intertrial intervals) for a minimum of 33% of sessions across conditions in vivo or from video. We calculated the percentage of integrity by dividing the total number of skills performed correctly by the total number of opportunities to perform a skill and multiplying by 100. Mean treatment integrity scores were 100% for all participants. A secondary observer also collected procedural integrity data for a minimum of 33% of treatment integrity sessions for IOA purposes. IOA data were calculated on a trial-by-trial basis, where the total number of agreements was divided by the total number of agreements plus disagreements, then converted to a percentage. Mean treatment integrity IOA scores were 100% for Zeek, Max, and Adam.

Results

Figure 1 represents the percentage of unprompted correct responses during baseline and all teaching conditions for Zeek, Max, and Adam. During baseline, all participants engaged in low to moderate levels of unprompted correct responses across conditions. Zeek demonstrated mastery in the comparison-first condition in 23 training sessions (207 training trials; 44 min 40 s training time). He did not achieve mastery in the other three conditions and training ceased according to the termination criteria. Zeek responded to the comparison array in the comparison-first condition prior to the delivery of the sample stimulus (represented by the gray bars in Figure 1) only during baseline (67% of baseline sessions; mean of 16.5% [range, 11% to 22%] of baseline trials in those sessions). No such responding was observed after training was initiated.

Max demonstrated mastery in the sample-first condition in 21 training sessions (189 training trials; 40 min 47 s training time) and in the sample-first with re-presentation condition in 22 training sessions (198 training trials; 42 min 2 s training time). He did not demonstrate mastery of target responses in the comparison-first or the simultaneous conditions. Max engaged in early responses to the comparison array in the comparison-first condition during baseline (29% of baseline sessions; mean of 16.5% [range, 11% to 22%] of baseline trials in those sessions) and training (67% of training sessions; mean of 23% [range, 11% to 66%] of training trials in those sessions).

Figure 1. The percentage of unprompted correct responses across the sample-first (SF), sample-first with re-presentation (SFRP), comparison-first (CF), and simultaneous (Sim) treatment conditions. Gray bars represent percentage of trials with early responses to the comparison stimuli in the comparison-first condition.

Figure 2. The total number of training trials for Zeek, Max, and Adam across the sample-first (SF), sample-first with re-presentation (SFRP), comparison-first (CF), and simultaneous (Sim) treatment conditions. Asterisks indicate that mastery was achieved in the number of trials depicted (in the absence of an asterisk, mastery was not achieved).

Figure 3. Total training time (in minutes) for Zeek, Max, and Adam across the sample-first (SF), sample-first with re-presentation (SFRP), comparison-first (CF), and simultaneous (Sim) treatment conditions. Asterisks indicate that mastery was achieved in the amount of training time depicted (in the absence of an asterisk, mastery was not achieved).

Unlike Zeek and Max, Adam mastered target responses in all conditions. More specifically, he demonstrated mastery in 6 training sessions (54 training trials; 7 min 29 s training time) in the simultaneous condition, 8 training sessions (72 training trials; 12 min 2 s training time) in the sample-first with re-presentation condition, 11 training sessions (99 training trials; 16 min 37 s training time) in the sample-first condition, and 12 training sessions (108 training trials; 17 min 37 s training time) in the comparison-first condition. Similar to Zeek, Adam responded to the comparison array prior to the presentation of the sample stimulus in the comparison-first condition during baseline (33% of baseline sessions; mean of 11% of baseline trials in those sessions), but we did not observe such responding after training was initiated. Across participants, we initiated training with trials conducted at a 0-s prompt delay, so unprompted correct responding was at zero until we increased the prompt delay to 5 s.

Figures 2 and 3 summarize the total training trials and total training time for all conditions for Zeek, Max, and Adam. We included a summary for conditions in which participants did not demonstrate mastery to show that these measures met the termination criteria when compared to the condition in which participants demonstrated mastery the fastest. For all participants, the lowest total training time was associated with the condition in which participants demonstrated mastery level responding in the fewest number of training sessions.

Discussion

When arranging MTS trials, consideration should be given to the order in which the sample and comparison stimuli are presented. In the present evaluation, the relative efficiency of the four presentation formats, as measured by total training trials and total duration, was learner specific across three children with autism spectrum disorder. The simultaneous procedure was associated with the fastest acquisition for Adam, the comparison-first arrangement was associated with the fastest acquisition for Zeek, and the sample-first and sample-first-with-re-presentation arrangements were associated with similarly fast acquisition for Max. This finding is consistent with the growing body of skill acquisition research that has demonstrated learner-specific outcomes (e.g., Boudreau, Vladescu, Kodak, Argott, & Kisamore, 2015; Carroll, Joachim, St. Peter, & Robinson, 2015; Rodgers & Iwata, 1991). The results suggest that students may benefit when teachers identify and implement a student-specific stimulus presentation format when teaching auditory–visual conditional discriminations, rather than using one procedure across all students as has been previously suggested (Green, 2001; Leaf & McEachin, 1999; Maurice et al., 1996; Sundberg & Partington, 1998).

One potential avenue to identify a consumer-specific stimulus presentation format is through an initial assessment. In a similar vein, recent studies have undertaken efforts to evaluate the usefulness of assessments to identify consumer-specific error-correction procedures (McGhan & Lerman, 2013), prompt type and prompt-fading procedures (Seaver & Bourret, 2014), and reinforcement arrangements (Johnson, Vladescu, Kodak, & Sidener, 2017). However, the current evaluation falls short in providing prescriptive information as to which presentation order should be used in subsequent training of auditory–visual conditional discriminations, because we did not include intrasubject replication. Future studies should include intrasubject replications to establish the reliability of outcomes and evaluate generality to other types of conditional discriminations (e.g., visual–visual conditional discriminations).

For all participants, the data from all four conditions were undifferentiated until mastery was achieved in the first condition or conditions. This indicates that differences in trials to mastery could be, in part, a result of uncontrolled factors. We attempted to address differences among stimuli across conditions and to assign stimuli to the four stimulus sets for each participant so as to ensure equivalence of those sets. However, differences in characteristics among stimuli associated with each condition could have contributed to inconsistent findings across participants. Further, although targets for each participant were selected based on educational goals and previous learning history, we cannot rule out that the targets for Zeek (states) and those for Max and Adam (animals) differed in difficulty, especially considering Zeek met the mastery criterion in only one condition. Participant instructional history indicated that participants had, at most, three auditory–visual conditional discrimination targets in training at a time. Therefore, it is possible that including twelve concurrent instructional targets in one domain may have impacted acquisition for our participants.

Although we took steps to minimize the possibility of interaction effects—by conducting sessions in a random order without replacement, assigning condition-correlated stimuli (colors), and requiring a minimum of 5 min to elapse between consecutive sessions—we may have observed multiple-treatment interference. That is, a participant’s experience in a treatment session in one condition may have influenced his responding in the subsequent treatment session in another condition (Higgins Hains & Baer, 1989). Future researchers could further minimize the possibility of multiple-treatment interference by increasing the minimum time between sessions of different conditions (e.g., alternate sessions by day). Future researchers could consider including a choice or preference measure for instructional conditions (e.g., Heal et al., 2009), as research suggests that giving participants a choice may be valuable to participants (e.g., Brigham & Sherman, 1973; Tiger, Hanley, & Hernandez, 2006) and may be associated with a decrease in problem behavior (Dyer, Dunlap, & Winterling, 1990).

The findings of the current evaluation contrast with those of Petursdottir and Aguilar (2016), who found that the sample-first procedure was consistently the most efficient across participants. In comparing the present evaluation to the one conducted by Petursdottir and Aguilar, several differences should be noted. Petursdottir and Aguilar presented stimuli via a computer, arranged differential reinforcement from the onset of instruction, and did not include prompting and prompt-fading strategies, whereas we delivered stimuli via tabletop procedures, arranged nondifferential reinforcement during the initial stages of acquisition, and included prompts and a prompt-fading strategy. Although the exact impact of these differences is unknown, it is possible that certain procedural features (e.g., prompt and prompt-fading strategies) may reduce the impact of antecedent stimuli presentation order. Further research is needed to determine how these procedural features influence the development of conditional stimulus control when using different presentation formats.

We included two conditions not evaluated by Petursdottir and Aguilar (2016), and our participants were diagnosed with autism spectrum disorder. Petursdottir and Aguilar did not include the simultaneous or sample-first-with-re-presentation conditions, so it is unclear whether these conditions would have been superior for the participants in their study, as they were for Adam and Max in the current evaluation. Whereas Petursdottir and Aguilar’s participants were typically developing children, ours were from a clinically relevant population, allowing us to examine the relative efficiency of stimulus presentation formats for consumers for whom match-to-sample is commonly used to establish auditory–visual conditional discriminations. One variable that may explain the difference in findings across the current participants is their learning histories. All participants had past and current instructional goals related to establishing auditory–visual conditional discriminations, and therefore had likely been exposed to one or more stimulus presentation formats. This history may be relevant, as previous research (Coon & Miguel, 2012; Freeman & Lattal, 1992) has demonstrated the influence of proximal history on subsequent responding. Future studies should establish the generality of findings through intrasubject replications and evaluate these conditions using participants without established ABA instructional histories.

Interestingly, Max, who reached mastery first in the sample-first and sample-first-with-re-presentation conditions, was the only participant to respond to the comparison array prior to the delivery of the sample stimulus during comparison-first training trials, and to fail to demonstrate mastery in the comparison-first condition. Moreover, Max’s propensity to respond to the comparison array prior to the delivery of the sample stimulus was absent at the beginning of training (although present during baseline) and emerged only after exposure to this condition. These data seem to contrast with previous research (McIlvane et al., 1990) in that we observed an increase, rather than a decrease, in comparison responding prior to the delivery of the sample. Similar to McIlvane et al. (1990), we presented the sample when 3 s elapsed without the participant responding to the comparison array. However, this delay may not have been sufficient to promote appropriate comparison control, and future studies could evaluate the effect of longer delays or alternative procedures (e.g., re-present the trial; Petursdottir & Aguilar, 2016).

Future researchers interested in stimulus presentation order should consider a number of factors. First, the current study did not analyze stimulus control topographies across conditions. Therefore, we cannot draw conclusions as to whether any of the presentation formats may promote or reduce irrelevant sources of stimulus control (e.g., position or stimulus biases). Future researchers could collect data on participant responding (e.g., specific comparison stimulus and position selected each trial) to allow for an analysis of undesirable performance patterns (see Fields, Garruto, & Watanabe, 2010).

Second, we did not collect data related to generalization (e.g., exemplars containing variation in noncritical features). Given that the stimulus presentation formats may differentially influence the development of stimulus control, these formats may also differentially influence the degree to which participants demonstrate generalized responding. Future researchers could conduct tests to determine whether correct responding occurs when stimulus exemplars not associated with training are presented during probe trials.

Third, future researchers could examine whether manipulating additional variables related to comparison and sample stimuli may influence the relative efficiency of stimulus presentation arrangements. For example, it is possible that a specific stimulus presentation order more efficiently establishes stimulus control when an increasing number of stimuli are arranged as comparisons. Increasing the number of comparisons may increase the difficulty of the simultaneous simple discrimination required between comparison stimuli, and in turn, the order in which sample and comparisons are presented may be more relevant.

Three additional limitations are worth mentioning. First, similar to Petursdottir and Aguilar (2016), we did not require an observing response as is typically used in basic research. That is, the presentation of the sample stimulus (in the comparison-first condition) or comparison stimuli (in the sample-first and sample-first with re-presentation conditions) was not contingent on a participant response. Rather, we required a trial-initiation response (touching a colored square of paper). The trial-initiation response could be considered an observing response in that it increases the likelihood that the participant will make sensory contact with the first stimulus presented. It should be noted, however, that it is fairly common practice not to require an observing response or differential observing response in applied studies that target auditory–visual conditional discriminations for participants with autism spectrum disorder (e.g., Carey & Bourret, 2014; Carp et al., 2012; Delfs et al., 2014; Dittlinger & Lerman, 2011; Fisher et al., 2014; Haq et al., 2015; McGhan & Lerman, 2013; Paden & Kodak, 2015). Moreover, recent applied research did not demonstrate consistently superior auditory–visual conditional discrimination acquisition in a condition that required a differential observing response relative to a condition that did not (Vedora, Barry, & Ward-Horner, 2017). Future researchers could evaluate the role of a trial-initiation response and determine what role it may play in establishing conditional stimulus control. Additionally, future studies are needed to clarify the conditions under which an observing response or differential observing response to the sample are necessary during conditional discrimination training.

Second, we did not continue training to mastery in all conditions. That is, once mastery was achieved in one condition, training continued in the other conditions for at least three sessions and 20% additional training time as long as no apparent increasing trend in performance was observed. Training termination was necessary for two participants. Because we did not know how much additional training would have been required to achieve mastery in all conditions for all participants, we judged an additional 20% of training time to be sufficient to draw conclusions regarding relative efficiency. We decided to discontinue training to prevent the possible establishment of the presentation of instructional stimuli as a conditioned reflexive motivating operation (Carbone, Morgenstern, Zecchin-Tirri, & Kolberg, 2007), to ensure that we completed the evaluation for all participants prior to the end of the school year, and to maximize the time participants spent receiving effective intervention.

In summary, stimulus presentation order may be an important factor for auditory–visual conditional discrimination acquisition for children with autism spectrum disorder. Future researchers may investigate whether initial assessments, previous learning history, and specific barriers to learning have implications for which stimulus presentation method is most efficient or effective. Further, an assessment of generalization and an analysis of specific response patterns (e.g., stimulus control topographies) could help distinguish what sources of control each condition has on responding.

References

Black, M. P., Skinner, C. H., Forbes, B. E., McCurdy, M., Coleman, M. B., Davis, K., & Gettelfinger, M. (2016). Cumulative instructional time and relative effectiveness conclusions: Extending research on response intervals, learning, and measurement scale. Behavior Analysis in Practice, 9, 58-62. https://doi.org/10.1007/s40617-016-0114-3

Boudreau, B. A., Vladescu, J. C., Kodak, T. M., Argott, P., & Kisamore, A. N. (2015). A comparison of differential reinforcement procedures on the acquisition of tacts in children with autism. Journal of Applied Behavior Analysis, 48, 918-923. https://doi.org/10.1002/jaba.232

Brigham, T. A., & Sherman, J. A. (1973). Effects of choice and immediacy of reinforcement on single response and switching behavior of children. Journal of the Experimental Analysis of Behavior, 19, 425-435. https://doi.org/10.1901/jeab.1973.19-425

Carbone, V. J., Morgenstern, B., Zecchin-Tirri, G., & Kolberg, L. (2007). The role of the reflexive conditioned motivating operation (CMO-R) during discrete trial instruction of children with autism. Journal of Early and Intensive Behavior Intervention, 4, 658. https://doi.org/10.1037/h0100399

Carey, M. K., & Bourret, J. C. (2014). Effects of data sampling on graphical depictions of learning. Journal of Applied Behavior Analysis, 47, 749-764. https://doi.org/10.1002/jaba.153

Carp, C. L., Peterson, S. P., Arkel, A. J., Petursdottir, A. I., & Ingvarsson, E. T. (2012). A further evaluation of picture prompts during auditory–visual conditional discrimination training. Journal of Applied Behavior Analysis, 45, 737-751. https://doi.org/10.1901/jaba.2012.45-737

Carr, J. E., Nicolson, A. C., & Higbee, T. S. (2000). Evaluation of a brief multiple-stimulus preference assessment in a naturalistic context. Journal of Applied Behavior Analysis, 33, 353-357. https://doi.org/10.1901/jaba.2000.33-353

Carroll, R. A., Joachim, B. T., St. Peter, C. C., & Robinson, N. (2015). A comparison of error correction procedures on skill acquisition during discrete-trial instruction. Journal of Applied Behavior Analysis, 48, 257-273. https://doi.org/10.1002/jaba.205

Cividini-Motta, C., & Ahearn, W. H. (2013). Effects of two variations of differential reinforcement on prompt dependency. Journal of Applied Behavior Analysis, 46, 640-650. https://doi.org/10.1002/jaba.67

Coon, J. T., & Miguel, C. F. (2012). The role of increased exposure to transfer-of-stimulus-control procedures on the acquisition of intraverbal behavior. Journal of Applied Behavior Analysis, 45, 657-666. https://doi.org/10.1901/jaba.2012.45-657

Cumming, W. W., & Berryman, R. (1961). Some data on matching behavior in the pigeon. Journal of the Experimental Analysis of Behavior, 4, 281-284. https://doi.org/10.1901/jeab.1961.4-281

Delfs, C. H., Conine, D. E., Frampton, S. E., Shillingsburg, M. A., & Robinson, H. C. (2014). Evaluation of the efficiency of listener and tact instruction for children with autism. Journal of Applied Behavior Analysis, 47, 793-809. https://doi.org/10.1002/jaba.166

Dittlinger, L. H., & Lerman, D. C. (2011). Further analysis of picture interference when teaching word recognition to children with autism. Journal of Applied Behavior Analysis, 44, 341-349. https://doi.org/10.1901/jaba.2011.44-341

Doughty, A. H., & Saunders, K. J. (2009). Decreasing errors in reading-related matching to sample using a delayed-sample procedure. Journal of Applied Behavior Analysis, 42, 717-721. https://doi.org/10.1901/jaba.2009.42-717

Dunn, M., & Dunn, L. M. (2007). Peabody picture vocabulary test (4th ed.). Circle Pines, MN: AGS.

Dyer, K., Dunlap, G., & Winterling, V. (1990). Effects of choice making on the serious problem behaviors of students with severe handicaps. Journal of Applied Behavior Analysis, 23, 515-524. https://doi.org/10.1901/jaba.1990.23-515

Fields, L., Garruto, M., & Watanabe, M. (2010). Varieties of stimulus control in matching-to-sample: A kernel analysis. The Psychological Record, 60, 3-26. https://doi.org/10.1007/BF03395691

Fisher, W. W., Kodak, T., & Moore, J. W. (2007). Embedding an identity-matching task within a prompting hierarchy to facilitate acquisition of conditional discriminations in children with autism. Journal of Applied Behavior Analysis, 40, 489-499. https://doi.org/10.1901/jaba.2007.40-489

Fisher, W. W., Pawich, T. L., Dickes, N., Paden, A. R., & Toussaint, K. (2014). Increasing the saliency of behavior–consequence relations for children with autism who exhibit persistent errors. Journal of Applied Behavior Analysis, 47, 738-748. https://doi.org/10.1002/jaba.172

Fisher, W. W., Piazza, C. C., Bowman, L. G., Hagopian, L. P., Owens, J. C., & Slevin, I. (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25, 491-498. https://doi.org/10.1901/jaba.1992.25-491

Freeman, T. J., & Lattal, K. A. (1992). Stimulus control of behavioral history. Journal of the Experimental Analysis of Behavior, 57, 5-17. https://doi.org/10.1901/jeab.1992.57-5

Gilliam, J. E. (2013). Gilliam autism rating scale (3rd ed.). Austin, TX: Pro-Ed.

Grannan, L., & Rehfeldt, R. A. (2012). Emergent intraverbal responses via tact and match-to-sample instruction. Journal of Applied Behavior Analysis, 45, 601-605. https://doi.org/10.1901/jaba.2012.45-601

Green, G. (2001). Behavior analytic instruction for learners with autism: Advances in stimulus control technology. Focus on Autism and Other Developmental Disabilities, 16, 72-85. https://doi.org/10.1177/108835760101600203

Groskreutz, N. C., Karsina, A., Miguel, C. F., & Groskreutz, M. P. (2010). Using complex auditory–visual samples to produce emergent relations in children with autism. Journal of Applied Behavior Analysis, 43, 131-136. https://doi.org/10.1901/jaba.2010.43-131

Grow, L. L., Carr, J. E., Kodak, T., Jostad, C. M., & Kisamore, A. N. (2011). A comparison of methods for teaching receptive labeling to children with autism spectrum disorder. Journal of Applied Behavior Analysis, 44, 475-498.

Grow, L., & LeBlanc, L. (2013). Teaching receptive language skills: Recommendations for instructors. Behavior Analysis in Practice, 6, 56-75. https://doi.org/10.1007/BF03391791

Grow, L. L., Kodak, T., & Carr, J. E. (2014). A comparison of methods for teaching receptive labeling to children with autism spectrum disorders: A systematic replication. Journal of Applied Behavior Analysis, 47, 600-605. https://doi.org/10.1002/jaba.141

Hanney, N. M., & Tiger, J. H. (2012). Teaching coin discrimination to children with visual impairments. Journal of Applied Behavior Analysis, 45, 167-172. https://doi.org/10.1901/jaba.2012.45-167

Haq, S. S., Kodak, T., Kurtz-Nelson, E., Porritt, M., Rush, K., & Cariveau, T. (2015). Comparing the effects of massed and distributed practice on skill acquisition for children with autism. Journal of Applied Behavior Analysis, 48, 454-459. https://doi.org/10.1002/jaba.213

Hausman, N. L., Ingvarsson, E. T., & Kahng, S. (2014). A comparison of reinforcement schedules to increase independent responding in individuals with intellectual disabilities. Journal of Applied Behavior Analysis, 47, 155-159. https://doi.org/10.1002/jaba.85

Heal, N. A., Hanley, G. P., & Layer, S. A. (2009). An evaluation of the relative efficacy of and children’s preferences for teaching strategies that differ in amount of teacher directedness. Journal of Applied Behavior Analysis, 42, 123-143. https://doi.org/10.1901/jaba.2009.42-123

Higgins Hains, A., & Baer, D. M. (1989). Interaction effects in multielement designs: Inevitable, desirable, and ignorable. Journal of Applied Behavior Analysis, 22, 57-69.

Johnson, K. A., Vladescu, J. C., Kodak, T., & Sidener, T. M. (2017). An assessment of differential reinforcement procedures for learners with autism spectrum disorder. Journal of Applied Behavior Analysis, 50, 1-14. https://doi.org/10.1002/jaba.372

Kodak, T., Clements, A., Paden, A. R., LeBlanc, B., Mintz, J. & Toussaint, K. A. (2015). Examination of the relation between an assessment of skills and performance on auditory–visual conditional discriminations for children with autism spectrum disorder. Journal of Applied Behavior Analysis, 48, 52-70. https://doi.org/10.1002/jaba.160

Leaf, R., & McEachin, J. (1999). A work in progress: Behavior management strategies and a curriculum for intensive behavioral treatment of autism. New York: DRL Books.

Maurice, C., Green, G., & Luce, S. C. (1996). Behavioral intervention for young children with autism. Austin, TX: PRO-ED.

McGhan, A. C., & Lerman, D. C. (2013). An assessment of error-correction procedures for learners with autism. Journal of Applied Behavior Analysis, 46, 626-639. https://doi.org/10.1002/jaba.65

McIlvane, W. J., Kledaras, J. B., Stoddard, L. T., & Dube, W. V. (1990). Delayed sample presentation in MTS: Some possible advantages for teaching individuals with developmental limitations. Experimental Analysis of Human Behavior Bulletin, 8, 31-33.

Paden, A. R., & Kodak, T. (2015). The effects of reinforcement magnitude on skill acquisition for children with autism. Journal of Applied Behavior Analysis, 48, 924-929. https://doi.org/10.1002/jaba.239

Petursdottir, A. I., & Aguilar, G. (2016). Order of stimulus presentation influences children’s acquisition in receptive identification tasks. Journal of Applied Behavior Analysis, 49, 58-68. https://doi.org/10.1002/jaba.264

Pilgrim, C. (2015). Stimulus control and generalization. In F. D. DiGennaro & D. D. Reed (Eds.), Autism service delivery (pp. 25-74). New York, NY: Springer.

Rodgers, T. A., & Iwata, B. A. (1991). An analysis of error-correction procedures during discrimination training. Journal of Applied Behavior Analysis, 24, 775-781. https://doi.org/10.1901/jaba.1991.24-775

Saunders, K. J., & Spradlin, J. E. (1989). Conditional discrimination in mentally retarded adults: The effect of training the component simple discriminations. Journal of the Experimental Analysis of Behavior, 52, 1-12. https://doi.org/10.1901/jeab.1989.52-1

Saunders, K. J., & Williams, D. C. (1998). Stimulus control procedures. In K. A. Lattal & M. Perone (Eds.), Handbook of research methods in human operant behavior (pp. 213). New York, NY: Plenum Press.

Seaver, J. L., & Bourret, J. C. (2014). An evaluation of response prompts for teaching behavior chains. Journal of Applied Behavior Analysis, 47, 777-792. https://doi.org/10.1002/jaba.159

Sidman, M., & Tailby, W. (1982). Conditional discrimination vs. matching to sample: An expansion of the testing paradigm. Journal of the Experimental Analysis of Behavior, 37, 5-22. https://doi.org/10.1901/jeab.1982.37-5

Sindelar, P. T., Rosenberg, M. S., & Wilson, R. J. (1985). An adapted alternating treatments design for instructional research. Education and Treatment of Children, 8, 67-76.

Skinner, B. F. (1950). Are theories of learning necessary? Psychological Review, 57, 193-216.

Slocum, S. K., Miller, S. J., & Tiger, J. H. (2012). Using a blocked-trials procedure to teach identity matching to a child with autism. Journal of Applied Behavior Analysis, 45, 619-624. https://doi.org/10.1901/jaba.2012.45-619

Sprinkle, E. C., & Miguel, C. F. (2012). The effects of listener and speaker training on emergent relations in children with autism. The Analysis of Verbal Behavior, 28, 111-117.

Sundberg, M. L. (2008). Verbal behavior milestones assessment and placement program: The VB-MAPP. Concord, CA: AVB Press.

Sundberg, M. L., & Partington, J. W. (1998). Teaching language to children with autism or other developmental disabilities. Danville, CA: Behavior Analysts.

Sy, J. R., & Vollmer, T. R. (2012). Discrimination acquisition in children with developmental disabilities under immediate and delayed reinforcement. Journal of Applied Behavior Analysis, 45, 667-684. https://doi.org/10.1901/jaba.2012.45-667

Tiger, J. H., Hanley, G. P., & Hernandez, E. (2006). An evaluation of the value of choice with preschool children. Journal of Applied Behavior Analysis, 39, 1-16. https://doi.org/10.1901/jaba.2006.158-04

Vedora, J., Barry, T., & Ward-Horner, J. C. (2017). An evaluation of differential observing responses during receptive label training. Behavior Analysis in Practice, 22, 1-6. https://doi.org/10.1007/s40617-017-0188-6

Vladescu, J. C., & Kodak, T. (2010). A review of recent studies on differential reinforcement during skill acquisition in early intervention. Journal of Applied Behavior Analysis, 43, 351-355. https://doi.org/10.1901/jaba.2010.43-351

Walker, B. D., & Rehfeldt, R. A. (2012). An evaluation of the stimulus equivalence paradigm to teach single-subject design to distance education students via Blackboard. Journal of Applied Behavior Analysis, 45, 329-344. https://doi.org/10.1901/jaba.2012.45-329

Williams, K. T. (2007). Expressive vocabulary test (2nd ed.). Minneapolis, MN: Pearson Assessments.

Received July 27, 2017
Final acceptance May 17, 2018
Action Editor, Anna Petursdottir

A Tutorial for Implementing Matrix Training in Practice


Sarah E. Frampton · Judah B. Axe
Accepted: 15 July 2022
© Association for Behavior Analysis International 2022

Abstract

Matrix training consists of arranging targets for instruction to promote fine-grained stimulus control, resulting in the establishment of skills without direct training. Recent reviews of the matrix training literature (Curiel et al., 2020a, b; Kemmerer et al., 2021) highlighted the efficacy and efficiency of the approach with learners with and without disabilities. These reviews noted substantial variations in procedures across studies, suggesting the approach may be flexibly deployed across content areas and teaching procedures. This outcome is positive for practitioners as they may customize matrix training to meet the unique needs of their clients. However, it also necessitates decision making to sort through the variations in the literature. This tutorial was developed to help practitioners weigh various considerations when using matrix training. Tools and resources are provided to illustrate and accelerate adoption into practice settings.

Keywords Efficiency · Matrix training · Recombinative generalization · Skill acquisition · Tutorial

Judah B. Axe
judah.axe@simmons.edu

May Institute, Inc., Randolph, MA, USA
Simmons University, Boston, MA, USA

 

Teaching generative responses under the control of the correct stimuli in a reasonable number of trials, with procedures that are relatively feasible to deploy, is no easy feat. Yet practitioners are asked to do this (and more) daily in service of their clients. One such skill is responding to two-component stimuli or instructions through tacting and listener responding (Skinner, 1957). These skills are addressed in the Verbal Behavior Milestones Assessment and Placement Program (VB-MAPP; Saaybi et al., 2019; Sundberg, 2008). In particular, milestone 9-M under tacts is: “Tacts . . . two-component verb-noun or noun-verb combinations . . . (e.g., washing face, Joe swinging, baby sleeping).” In addition, milestone 9-M under listener responding is: “Follows 50 two-component noun–verb and/or verb–noun instructions (e.g., Show me the baby sleeping. Push the swing).” These two-component skills may be taught using any prompting/fading procedure, such as time delay and most-to-least prompting. In addition, these skills may be taught efficiently using matrix training, a method of systematically arranging learning targets to promote generative responding (Frampton et al., 2016; Goldstein, 1983).

Matrix training, which is less of a training procedure and more of a planning process, comprises several steps. First, the matrix is designed with two or more dimensions with the components of the target skills isolated on each axis (e.g., actions on one axis and objects on the other axis; Fig. 1, example 1). Within the cells of the matrix are the learning targets consisting of combinations of the components. For example, Frampton et al. (2019) designed two-dimensional matrices with colors on one axis and shapes on the other axis. The cells consisted of the varying color–shape combinations (e.g., red star, blue circle, blue star, red circle). Goldstein and Mousetis (1989) created three-dimensional matrices with objects, prepositions, and locations on the axes. The cells consisted of object-preposition-location combinations (e.g., button-under-bed, penny-behind-couch). Matrices may include varying numbers of components on each axis, and additional components increase the number of combinations exponentially (e.g., 3 × 3 = 9; 4 × 4 = 16; 5 × 5 = 25).
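As a concrete planning aid, the cells of a two-dimensional matrix are simply the Cartesian product of the components on each axis. The following minimal sketch uses Python's standard library; the color and shape lists mirror the Frampton et al. (2019) example, and the specific component names are illustrative.

```python
from itertools import product

# Components isolated on each axis of a two-dimensional matrix
colors = ["red", "blue", "green"]
shapes = ["star", "circle", "square"]

# Each cell of the matrix is one color-shape combination (a learning target)
cells = list(product(colors, shapes))

print(len(cells))  # 3 x 3 = 9 combinations
for color, shape in cells:
    print(f"{color} {shape}")
```

Adding a fourth component to each axis would yield 16 cells, and a fifth would yield 25, matching the growth in combinations described above.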

Second, cells within the matrix are strategically selected for training. Two common approaches are nonoverlap/diagonal training and overlap/stepwise training (Fig. 2). Third, after training those cells, probes (i.e., tests) of untrained combinations may reveal that additional combinations were learned without being taught. This outcome has been described as recombinative generalization, “producing or responding to novel utterances; when familiar stimuli are recombined in novel ways” (Goldstein & Mousetis, 1989, p. 246). Fourth, effects beyond the training matrix may be observed (Kemmerer et al., 2021). That is, following training within one matrix, responses within an untrained matrix with known components may be established (e.g., Axe & Sainato, 2010; Frampton et al., 2016, 2019; Marya et al., 2021).
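The nonoverlap/diagonal selection described in the second step, and the probe set it leaves behind, can be sketched the same way. This is a planning illustration only (not a training procedure), and the component names are again illustrative.

```python
from itertools import product

colors = ["red", "blue", "green"]
shapes = ["star", "circle", "square"]

# Nonoverlap/diagonal training: each component appears in exactly one
# trained cell (the diagonal of the matrix).
trained = [(colors[i], shapes[i]) for i in range(len(colors))]

# Every remaining cell is an untrained combination to probe for
# recombinative generalization.
probed = [cell for cell in product(colors, shapes) if cell not in trained]

print(trained)      # [('red', 'star'), ('blue', 'circle'), ('green', 'square')]
print(len(probed))  # 6 untrained combinations to probe
```

With this arrangement, three trained cells potentially yield six additional combinations without direct teaching, which is the efficiency advantage emphasized in the reviews cited above.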

Fig. 1  Example matrices

Fig. 2  Matrix design and training options

The efficacy of matrix training has been demonstrated with varying types of skills and with learners with and without disabilities (see Curiel et al., 2020a, b; Kemmerer et al., 2021). This efficacy may be due to the development of strong stimulus control over unique elements of responses (Goldstein & Mousetis, 1989). Consider the example of a color–shape tact in which the desired outcome is tacting the color under the control of the color element of the stimulus and tacting the shape under the control of the shape element of the stimulus. Matrix training necessitates breaking complex responses into smaller units, consistent with Palmer’s (2012) description of atomic repertoires. The more precisely the behavioral units can be isolated, the more varied the potential combinations. Thus, matrix training requires arranging the desired elements to control responding and considering which combinations to teach and which to probe.

In practice, as time and training resources are often scarce (Cook & Odom, 2013; Odom et al., 2010), the efficiency of interventions is paramount. Matrix training may be desirable as it does not necessitate changes to standard teaching procedures. For example, for the trained cells, researchers have used least-to-most prompting (Wilson et al., 2017), time delayed most-to-least prompting (Pauwels et al., 2015), and video modeling (Kinney et al., 2003). In addition, matrix training has been deployed across curricular areas, such as following action–object instructions in a play context (Wilson et al., 2017), tacting with prepositions (Pauwels et al., 2015), and spelling (Kinney et al., 2003). This diversity of curricular areas and operants suggests flexible use in practice.

The purpose of this tutorial is to assist practitioners in using matrix training. The research-to-practice gap is a substantial barrier to improving outcomes in real world contexts (Carnine, 1997), even in scientist-practitioner models such as ABA. As efficacious procedures are identified in the literature, resources must be developed to support practitioners in customizing them to meet the needs of their clients. This tutorial was designed for master’s-level practitioners who are competent with skills described on the Behavior Analysis Certification Board (BACB) Task List, such as assessment of current skills, discrimination training, selecting interventions based on client-specific variables, and interpreting data to support decision making. We describe several considerations for using matrix training, including (1) selecting the curricular area; (2) designing matrices; (3) developing training conditions; (4) evaluating results; and (5) implementing on a wider scale. For each consideration, we provide clarifying questions designed to prompt more nuanced decision making related to the design of matrix interventions. These considerations and questions are outlined in Fig. 3.

Fig. 3  Flowchart for matrix planning

Considerations When Selecting the Curricular Area

Matrix training has been efficacious across a wide range of skills (Curiel et al., 2020a, b; Kemmerer et al., 2021). The following are four questions to consider (not necessarily in order) when selecting a curricular area to target using matrix training.

Are the Components of the Desired Response Controlled by Unique Environmental Stimuli?

Responses targeted in a matrix must consist of two or more components, each controlled by a unique environmental stimulus. For example, Curiel, Curiel, and Li (2020b) taught adults with disabilities to tact various times on a clock. When tacting time, the hour stimulus controls the hour response, and the minute stimulus controls the minute response. As a nonexample, when tacting a “firefighter,” the responses “fire” and “fighter” are not controlled by distinct elements of a firefighter. Both responses (i.e., “fire” and “fighter”) are evoked as a single unit under the control of the stimulus. Examples of combinations with unique environmental stimuli are object–action (e.g., bear jumping), color–object (e.g., red car), and preposition–object (e.g., under tree). See Fig. 1 for several example matrices.

Can (and should) the Components of the Desired Responses be Flexibly Recombined?

Consider a 3 × 3 matrix with pick up, push, and throw (actions) on one axis and ball, cup, and stapler (objects) on the other axis. Picking up and pushing those three objects poses no problems, but throwing a stapler is dangerous. All combinations in a matrix “need to work” in that sense. For another example, if targeting cutting, all objects on the other axis must be able to be cut (e.g., paper, playdoh). It is critical to ensure each component on one axis can and should interact with all the components on the other axis.

It is also critical to consider whether the target response can be emitted in a similar manner when recombined. Reading in English is a prime example (Mueller et al., 2000). Consider arranging the initial sounds for l and n on one axis and the final sounds of -ow and -et on the other axis. Recombinative generalization with “now” after training “low” is faulty because the client would incorrectly pronounce “now” as “no” (like “low”). It may be that the value of learning recombinations, even if nonsensical, outweighs the relevance to the terminal skill. For example, color and animal combinations may be taught because they are both topics of interest to the client, despite the fact that animals do not come in every shade of the rainbow.

In addition, the age of the client should be considered. For a 20-year-old in a vocational training program, the focus may be exclusively on daily meal preparation actions (e.g., pour milk/oil/batter, scoop flour/sugar/baking powder). As an alternative, for a 3-year-old in early intervention, the team may target silly object–action combinations (e.g., a dog reading, a bunny writing). After drafting a matrix, it is important to review each combination and check for internal consistency and practical relevance.

Does the Learner Demonstrate Prerequisites for the To‑Be‑Combined Skill?

When developing a matrix, consider the client’s skills within the targeted operant and the complexity of the skills to be trained and probed. For example, prior to their study, Axe and Sainato (2010) confirmed that the participants could exhibit listener responding with some pictures but not with the experimental pictures (i.e., prerequisites within the targeted operant). If a client has made limited progress within the targeted operant, there may be barriers to learning, and the skill may be best addressed using a more direct intervention. For example, ensure a client can follow one-step instructions before targeting two-step instructions, the sequence reflected in the VB-MAPP (Sundberg, 2008) and typical child development. If acquiring one-step instructions was lengthy and labor intensive, assume similar barriers when targeting combinations of these skills. We recommend taking time to address these barriers with simpler skills before targeting combinations.

We also recommend ensuring the target response can be successfully occasioned by some form of controlling prompt (Wolery et al., 1992). In essence, does the target response occur presently but under control of different antecedent variables? For example, if targeting two-word tacts, evaluate two-word echoics. If the client requires extensive shaping to emit an acceptable approximation as an echoic, assume similar barriers when targeting tacts. However, if an acceptable approximation can be occasioned as an echoic, all that must occur is transfer of stimulus control to the dictated verbal stimuli. If targeting two-step instructions, evaluate two-step imitation. These considerations are not unique to matrix training but may be of particular importance because the aim of matrix training is developing more refined stimulus control (Frampton et al., 2019). Starting with behaviors that can be reliably occasioned by controlling prompts may narrow the potential sources of error once matrix training is underway.

What Materials Will Be Used?

A final consideration in the area of content selection is related to practicality. When selecting materials for matrix training, consider the exponential increases in the number of targets with each added component. For example, Frampton et al. (2019) created three 3 × 3 matrices for a total of 27 targets. Each card had to be created, laminated for durability, strategically stored, and replaced when lost. The materials were used with six clients, making the time and resource investment worthwhile. However, those same efforts for one client may not be possible depending on the setting and the practitioner’s caseload.

Whenever possible, we recommend using existing materials. Using materials from the individual’s typical learning environment may promote generalization (Stokes & Baer, 1977) and save time on material creation. Curiel and Curiel (2021) used bills and coins to teach listener responding with sums of money. As not every dollar/coin combination was evaluated, the investigation required a simple array of four bills and four coins. Marya et al. (2021) used animal figurines and accessory items from the clinical space to illustrate object–action targets. This was likely faster than finding or creating pictures of each animal engaging in each action. Using three-dimensional stimuli for illustrating actions may also be advantageous as not all actions can be easily represented in a two-dimensional format.

Videos may be another means to effectively illustrate actions. As organization of videos is important, Kohler and Malott (2014) embedded 162 videos into PowerPoint slides for ease of locating and to preclude needing to scroll through video files on a tablet or other device. A consideration for selecting actors is that if they are known, the client may tact them by name (e.g., “Steve eats cake”). However, if another client does not know Steve, rerecording the videos with known actors may be more time efficient than teaching the new tacts of the actors. Finally, we recommend using easily adaptable materials whenever possible. Consider the ease of writing varying digital times on a whiteboard in comparison to the time needed to create the 720 hour–minute combinations on index cards or PowerPoint slides.

Considerations When Designing Matrices

Recent reviews have highlighted the diversity in matrix sizes and components across the literature (Curiel et al., 2020a, b; Kemmerer et al., 2021). Two-dimensional matrices are most common, ranging in size from 2 × 3 to 12 × 12. These ranges suggest that there are many choices and that no format or size has been established as the best practice. The following are three questions to consider when designing matrices.

How Many Dimensions Will Be Included?

The number of dimensions should be based on how many active variables one is aiming to manipulate. If a skill (e.g., agent–verb tacts) is already established at strength, continued manipulation of these combinations may not be necessary in subsequent matrices. The established components could be grouped as a single component of the matrix and manipulated with respect to a new variable of interest (e.g., adverbs). Thus, rather than creating a three-dimensional matrix (e.g., agent–verb–adverb), a two-dimensional matrix could be formed (e.g., agent + verb–adverb) with fewer permutations (see Fig. 4). This may permit closer evaluation of the most critical, new aspect of the response and keep the complexity of the matrix from accelerating too quickly. We recommend using the fewest dimensions needed to evaluate the development of strong stimulus control for the included components.

Fig. 4  Options for three dimensional matrices

Although designing two-dimensional matrices is fairly straightforward, three-dimensional matrices involve rapid increases in combinations with each additional component (e.g., 3 × 3 × 3 = 27; 3 × 3 × 4 = 36).

Matrices need not have the same number of components on each axis, such as a matrix with three objects, five prepositions, and six locations (Goldstein & Mousetis, 1989). Although not evaluated in research, matrices could include many (e.g., five) dimensions to capture more sophisticated skills, such as agent–verb–adverb–preposition–location (e.g., the cat walked quickly to the fence; the dog jumped eagerly on the couch). As the matrix size increases, consider the sequence of the trained responses, and check for alignment with established grammatical structures. Practitioners interested in teaching skills of this level of complexity may consider packaged interventions aimed at teaching grammatical skills in a generative manner, such as Language for Learning (Engelmann & Osborn, 2008).
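The same product logic extends to unequal, higher-dimensional matrices. The sketch below uses the three-object, five-preposition, six-location structure reported by Goldstein and Mousetis (1989); aside from "button," "penny," "under," "behind," "bed," and "couch," which appear in the examples above, the component names are placeholders.

```python
from itertools import product

# Axes need not be equal in size (cf. the three objects, five prepositions,
# and six locations in Goldstein & Mousetis, 1989). Most names here are
# illustrative placeholders.
objects = ["button", "penny", "block"]
prepositions = ["under", "behind", "on", "beside", "in"]
locations = ["bed", "couch", "table", "chair", "shelf", "box"]

combinations = list(product(objects, prepositions, locations))
print(len(combinations))  # 3 x 5 x 6 = 90 possible targets
```

Enumerating the full set before training begins makes it easy to see how quickly an additional dimension or component inflates the pool of possible targets, and thus how much probing a given design implies.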

How Many Components Per Dimension Will be Included?

When determining the number of components per dimension, consider whether the purpose of the intervention is content- or cusp-oriented. If the purpose is to efficiently teach content of educational or functional relevance to the client, all combinations are important to teach. For example, a client has not mastered telling time if they can respond only to quarter increments (e.g., 1:15; 2:30; 3:45). A practitioner may choose to target only certain increments to build an initial foundation of success (see Curiel & Curiel, 2021; Curiel et al., 2020b). But eventually the client should be able to respond to any combination to truly master the content.

If the purpose is to establish a behavioral cusp, it may be necessary to evaluate only a subset of components. With a cusp orientation, the taught combinations are less critical than the end goal of establishing fluent recombinative generalization. For example, Frampton et al. (2016) and Marya et al. (2021) evaluated a subset of agent–action tacts in 3 × 3 matrices. The evaluations continued until the participants could emit recombined responses from the trained matrices, as well as combined responses to targets from an untrained matrix. Performance of this nature is suggestive of an atomic repertoire (Palmer, 2012), such that any permutation of known components may be established. This approach precludes the need to teach every combination (Goldstein et al., 1987). As the client masters new components (e.g., more objects and actions), they may be recombined with all previously learned components. In this case, teaching should continue until evidence of this fluid and flexible repertoire of recombination is demonstrated across a variety of new targets.

Will the Effects of Instruction be Assessed Beyond the Training Matrix?

Several studies have assessed the effects of matrix training across additional exemplars and operants (Kemmerer et al., 2021). Researchers have used a generalization matrix in which known components are evaluated in combinations with trained or other known components (Fig. 5). For example, Axe and Sainato (2010) evaluated combinations of trained preliteracy skills and known pictures, letters, and numbers. Frampton et al. (2016, 2019) evaluated combinations of known components with other known components similar to those trained in the initial matrix. Performance on generalization matrices indicates the effects of the intervention on skills of the same operant and level of complexity without additional training. Components in generalization matrices must be known or trained.

Fig. 5 Variations of matrix designs

Additional types of generalization are possible in matrix training. After training multiple combinations, testing for stimulus generalization is important (LaFrance & Tarbox, 2020; Stokes & Baer, 1977), such as across instructors and materials (Goldstein & Mousetis, 1989) and across novel peers and settings (Hatzenbuhler et al., 2019). If training is conducted in a tightly controlled environment (i.e., structured teaching session at a table or desk), it is important to demonstrate the effects with caregivers and peers in real-world, meaningful contexts. In addition, Goldstein and Mousetis (1989) taught tacts and tested for the emergence of listener responses, and vice versa. This type of testing in the opposite modality is reasonable given findings from the bidirectional naming literature (Horne & Lowe, 1996; Miguel, 2016). Achieving these types of generalization will further enhance the efficiency of matrix training.

Considerations for Selecting Training and Testing Arrangements

We recommend conceptualizing “matrix training” as “matrix planning” because the effects are based on planning the sequence of training and probing, rather than the training procedures (e.g., time delay, most-to-least prompting). We share two considerations for deciding which targets will be trained and which will be probed.

What Variation of Matrix Training will be Used?

The decision of which variation of matrix training to use (i.e., what to train and what to probe) should be linked to the status of the component skills as known or unknown. The prevailing recommendation is that with known components, nonoverlap training (i.e., “diagonal training”) may be sufficient (Curiel et al., 2020a, b; Kemmerer et al., 2021; Pauwels et al., 2015). In nonoverlap training, only the targets along the diagonal of a matrix are trained (see Fig. 2). Each diagonal target pairs one component from one axis with one component from the other axis (or axes). Because each component is trained in only one target, the trained combinations do not overlap across multiple components on other axes.
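The diagonal selection can be sketched as follows; the object and action labels are hypothetical placeholders, and a square 3 × 3 matrix is assumed.

```python
# Sketch of nonoverlap ("diagonal") target selection for a square matrix.
# Component names are hypothetical placeholders.
objects = ["ball", "car", "cup"]      # axis 1
actions = ["push", "shake", "drop"]   # axis 2

# Train only the cells on the diagonal: each component appears in
# exactly one trained combination, so trained cells do not overlap.
trained = list(zip(actions, objects))
untrained = [(a, o) for a in actions for o in objects if (a, o) not in trained]

print(trained)         # [('push', 'ball'), ('shake', 'car'), ('drop', 'cup')]
print(len(untrained))  # 6 off-diagonal cells left to probe in a 3 x 3 matrix
```

Of the nine cells, only three are trained; the remaining six are probed for recombinative generalization.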

Inclusion of known components may be the most conservative approach. Using known components ensures some prerequisite skills for learning the combinations. Progression from simple (i.e., component) to complex (i.e., combination) skills aligns with typical developmental pathways. For example, children may consistently tact with single words before tacting with combinations of words (Brown, 1973). The VB-MAPP (Sundberg, 2008) reflects this progression, such as actions in Listener Milestone 8 and noun–verb actions in Milestone 9.

On the other hand, when unknown components are used or mixed with known targets, overlap training may be required (Curiel et al., 2020a, b; Kemmerer et al., 2021), in which a minimum of two targets are trained per component (see Fig. 2). Trained combinations are those on the diagonal and those to the right of each diagonal combination to create a stair-step pattern (also known as “stepwise training”). As each component is taught across at least two targets, there is overlap within and across components.
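The stair-step pattern can be sketched in the same style; the labels are hypothetical, and wrapping back to the first column for the last diagonal cell is an assumption made here so that every component appears in exactly two trained combinations.

```python
# Sketch of overlap ("stepwise") target selection: the diagonal plus the
# cell immediately to the right of each diagonal cell (wrapping at the
# edge, an assumption made for this sketch).
# Component names are hypothetical placeholders.
objects = ["ball", "car", "cup"]      # axis 1
actions = ["push", "shake", "drop"]   # axis 2

n = len(objects)
trained = []
for i in range(n):
    trained.append((objects[i], actions[i]))            # diagonal cell
    trained.append((objects[i], actions[(i + 1) % n]))  # stair-step cell

print(trained)
# Each object and each action now appears in two trained combinations,
# producing the overlap within and across components.
```

Compared to diagonal training, twice as many cells are trained, which is the cost of teaching unknown components alongside the combinations.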

Using unknown components may be more efficient than using known components. Two types of skills are taught in matrix training: (1) the components and (2) the combinations (Goldstein, 1983). For example, Axe and Sainato (2010) used unknown components, and when they taught each diagonal skill (e.g., “underline the pepper”), they essentially taught three skills: the listener response of “underline” (action), the listener response of “pepper” (object), and the combination of performing the action with the object. On the other hand, Frampton et al. (2016) used known components as the participants could tact “dog” and “jumping” but not the combination, “dog jumping.” With known components, one skill is trained: the combination. With unknown components, three (or more) skills are trained: the component from each axis and the combination.

Learning a combination may entail learning an autoclitic frame (i.e., grammatical structure, word-order rule; Skinner, 1957), as it is correct to say, “dog jumping” but not “dogging jump” or “jump dogging.” This learning is heightened with three-dimensional matrices, such as the autoclitics involved in object–preposition–location (Goldstein & Mousetis, 1989) or agent–action–object (Kohler & Malott, 2014). When the components are known, training only one cell may be needed to learn the autoclitic frame/combination (Goldstein et al., 1987). A final consideration is that although nonoverlap training with unknown components may be most efficient, training unknown components in combination may take more trials to criterion than training known components in combination or training unknown components in isolation (Bergmann et al., 2022).

How will Training be Sequenced?

Combinations may be trained all at once (i.e., simultaneously), a few at a time (i.e., sequentially), or some combination of the two. Hatzenbuhler et al. (2019) trained all four play actions simultaneously. Curiel and Curiel (2021) trained targets 1 and 2 simultaneously, then 3 and 4 simultaneously, then 1–4 in mixed training, then 5 and 6, then 1–6 in mixed training. This type of sequence was most efficacious when teaching algebra to college students (Mayfield & Chase, 2002). As an alternative, several studies trained sequentially by dividing matrices into submatrices (e.g., Axe & Sainato, 2010; Jimenez-Gomez et al., 2019). For example, Axe and Sainato (2010) taught submatrices 1 and 2 simultaneously and 3 and 4 sequentially; each submatrix entailed training and probing for recombinative generalization. Decisions related to training sequences may depend on the size of the matrix. If using a small matrix (e.g., 2 × 2, 3 × 3), simultaneous training may be optimal as it promotes conditional discriminations (Grow et al., 2011, 2014). However, with a larger matrix, simultaneous training may be cumbersome, and submatrices and/or sequential training may be indicated.
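Dividing a larger matrix into submatrices for sequential training can be sketched as below; the 4 × 4 layout, the 2 × 2 block size, and the labels are illustrative assumptions.

```python
# Sketch: splitting a large matrix into submatrices for sequential
# training. A 4 x 4 matrix is divided into four 2 x 2 submatrices;
# all labels are hypothetical placeholders.
objects = ["ball", "car", "cup", "hat"]
actions = ["push", "shake", "drop", "tap"]

def submatrices(rows, cols, size):
    """Yield (row_block, col_block) pairs covering the full matrix."""
    for r in range(0, len(rows), size):
        for c in range(0, len(cols), size):
            yield rows[r:r + size], cols[c:c + size]

for i, (objs, acts) in enumerate(submatrices(objects, actions, 2), start=1):
    print(f"Submatrix {i}: objects={objs}, actions={acts}")
```

Each submatrix could then receive its own training-and-probe cycle before moving to the next, consistent with the sequential approach described above.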

Considerations for Evaluating Results

Kemmerer et al. (2021) noted variations in how probes for untrained targets interacted with the training conditions. In some studies, probes were conducted only after completing all training conditions (i.e., a posttest; Curiel & Curiel, 2021; Frampton et al., 2016, 2019; Jimenez-Gomez et al., 2019; Marya et al., 2021; Naoi et al., 2006). In other studies, probes were conducted over the course of training (e.g., Axe & Sainato, 2010; Curiel et al., 2016, 2018). Decisions of when and what to probe may be influenced by several factors, detailed in the following three considerations.

When will Probes be Conducted in Relation to Training?

Administering one posttest for a large matrix may be likened to the “train and hope” strategy Stokes and Baer (1977) cautioned against. In other words, this is using summative assessment rather than formative assessment (Fuchs et al., 1993). The number of trials needed for mastery may depend on the duration between pretest and posttest (Fuller & Fienup, 2018). Long time lapses may weaken stimulus control for the trained targets, which may undermine the success of the intervention. Long time lapses also delay remedial procedures if optimal results are not obtained. On the other hand, if there is no recombinative generalization in a small matrix or submatrix (see Fig. 5), additional targets may be trained to achieve recombinative generalization before progressing to later submatrices. In an ideal situation, less training will be required across submatrices (see Trey and Rex’s performance in Axe & Sainato, 2010, for an example) consistent with the concept of learning set (Saunders & Spradlin, 1993).

The frequency of probes may depend on the matrix size: the larger the number of components, the greater the number of probed combinations. With large matrices, a random selection of untrained combinations may be probed, because a sample of instances of recombinative generalization should be representative of the rest. Thus, targets 1B, 2C, and 3A may be probed in Session 1, and targets 1C, 2A, and 3B in Session 2 (see Fig. 5). This approach reduces the time spent probing while allowing ongoing assessment of the target outcome.
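The rotation of probe targets across sessions can be sketched as below; the 3 × 3 layout, diagonal training, and two-session split are illustrative assumptions.

```python
# Sketch: rotating which untrained cells are probed each session so
# that, over several sessions, the full set of untrained cells is
# sampled. The 3 x 3 layout and session split are assumptions.
rows = ["1", "2", "3"]
cols = ["A", "B", "C"]

# Untrained cells = everything off the diagonal (assuming diagonal training).
untrained = [r + c for i, r in enumerate(rows)
             for j, c in enumerate(cols) if i != j]

# Session 1 probes one cell per row and column; Session 2 probes the
# rest, mirroring the 1B/2C/3A then 1C/2A/3B rotation described above.
session1 = [rows[i] + cols[(i + 1) % 3] for i in range(3)]
session2 = [cell for cell in untrained if cell not in session1]

print(session1)  # ['1B', '2C', '3A']
print(session2)  # ['1C', '2A', '3B']
```

Because each session touches every row and column once, each probe set samples every component while keeping individual sessions short.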

What Contingencies for Responding will be Used During Probe Conditions?

Researchers must tightly control procedures to conclude that the independent variable, and nothing else, produced the change in the dependent variable. Therefore, many matrix training studies used extinction during the probes (e.g., Curiel & Curiel, 2021; Curiel et al., 2020a, b; Frampton et al., 2016, 2019; Kohler & Malott, 2014; Marya et al., 2021; Solano et al., 2021). According to Stokes and Baer (1977), to claim results are attributable to generalization, one must demonstrate that “no extratraining manipulations are needed for extratraining changes” (p. 350). Reinforcement may be considered a form of manipulation; thus, applying the training contingencies to untrained targets undermines the analysis of these responses as a product of generalization.

However, use of lean or extinction schedules of reinforcement may also weaken responding. For example, in the study by Frampton et al. (2019), two participants (George and Tony) demonstrated mastery-level recombinative generalization during initial posttraining probes. As the probes continued under extinction, rates of responding decreased, all the way to 0% in some instances. Had reinforcement been provided in the probes, these participants may not have required remedial training sessions, saving instructional time.

Fortunately, such tight control is not needed in practice, and desired behaviors should be reinforced. A consideration for selecting contingencies during probes is examining the schedules of reinforcement in the “natural environment” (Stokes & Baer, 1977). If the target skills are intended to occur in lean-reinforcement contexts (e.g., taking a spelling test), use of extinction or a lean schedule of reinforcement may be appropriate. Alternatively, if the target skills are intended to occur in the context of an enriched play activity, asking a parent to withhold reinforcement following the first occurrence of a skill they have been working on for weeks may be unacceptable. An approach for transitioning from reinforcement (during training) to extinction (during probes) is schedule thinning (Solano et al., 2021).

How will Results be Analyzed?

Analyzing results requires defining the target behaviors, measurement systems, and visual analysis procedures (e.g., AB design, multiple baseline design). When determining parameters for mastery, decide if the purpose of the matrix is to establish content (probe all combinations) or a cusp (probe a sample of combinations). In addition, consider the likelihood of chance responding. If evaluating an auditory–visual conditional discrimination skill with a field of three, a client may be correct by chance in one out of three opportunities. Thus, multiple evaluations may be needed to increase confidence that correct selections were not due to chance. Chance responding is less of a concern with tacting.
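The chance-responding concern can be quantified with a simple binomial calculation; this is a minimal sketch assuming independent trials with a field of three (chance probability 1/3 per trial).

```python
# Sketch: probability that a chance-level responder "passes" a probe.
# With a field of 3, chance of a correct selection is 1/3 per trial;
# independent trials are assumed.
from math import comb

def p_at_least(k, n, p=1/3):
    """P(at least k correct out of n trials by chance alone)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(round(p_at_least(1, 1), 3))  # single trial: 0.333
print(round(p_at_least(3, 3), 3))  # three correct out of three: ~0.037
```

One correct selection is unconvincing (a one-in-three fluke), whereas three consecutive correct selections would occur by chance less than 4% of the time, which is why multiple evaluations increase confidence.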

When evaluating the effects of matrix training, it may be useful to mix probes for trained and untrained targets to permit the strongest analysis of obtained results. Figure 6 details potential results and suggested remedial or future steps. If both the trained and untrained targets occur at low levels, conditions within the probe session may be to blame (i.e., overall extinction effect, low motivation). The training could be repeated or enhanced, or the probe conditions could be modified to support performance (e.g., Frampton et al., 2019). Should the trained targets occur at strength and the untrained targets at low levels, additional targets may be trained (e.g., Pauwels et al., 2015). If correct trained and untrained responding occurs across targets in an initial matrix but not in a generalization matrix, training across matrices may be effective (e.g., Frampton et al., 2016; Marya et al., 2021). If correct responding is observed across matrices, more targets, including across operant classes, may be trained and probed (e.g., Axe & Sainato, 2010; Curiel et al., 2016; Goldstein & Mousetis, 1989).

Fig. 6  Troubleshooting steps across outcomes. Steps progress from left to right

Considerations for Implementation on a Wider Scale

Matrix training has the potential to enhance the overall efficiency of clinical programming on both a small and large scale. Because the power of matrix training lies in the planning, time and energy may be saved by pooling resources and developing shared matrix-based protocols. This consideration should not diminish the obligation to meet each client’s unique needs. Rather, dissemination on a wide scale should encourage personalization where appropriate while hardwiring the effective mechanisms, such as disseminating predeveloped matrices that require only the insertion of client-specific targets (see Fig. 5, Supplemental Materials 1 and 2). Switching out an object (e.g., bird) for a favorite character (e.g., Donatello) within a predeveloped matrix requires far less time and effort than developing a new matrix from scratch. The following two considerations may assist in leveraging the power of shared program banks to accelerate the use of matrix training.

How will Matrix Programs be Stored and Shared?

The efficiency of matrix training may be enhanced if it is deployed within an organization’s shared protocol or data collection systems. In the simplest form, template data sheets (see Supplemental Material 1) may be developed and shared that connect to template matrices (Fig. 5). Practitioners could use their answers to the questions in this tutorial to generate a matrix and adjust the templates for each client. The data sheets should be designed to highlight critical steps, such as identifying the targets to be trained and the flow of procedures (e.g., probe Matrix 1, then the Generalization Matrix). Embedded prompts and cues may support the design and implementation of matrices. These data sheets may be supplemented with template protocols (see Supplemental Material) with key procedures noted and left blank to tailor to clients’ unique needs. Template protocols may reduce the time spent writing and developing documents, allowing a rapid transition into intervention.

Matrix training may also be hardwired into electronic program banks such as CentralReach©. Programs can be designed to hardwire matrix elements into curricular areas that fit with matrix training (e.g., time, money, play, following instructions, tacting). Several suggestions for accomplishing this are included in Supplemental Material 2. Targets may be sequenced so that diagonal targets are trained first, then overlapping targets, and then the remaining untrained targets. In CentralReach©, this can be done with each target as a child branch within a matrix folder or within a task analysis with each step serving as a target. Depending on the method of target building (as child branches or a task analysis), various auto-progression features may be used to move from intervention to probes, or targets may be moved manually between phases.

How will Matrix Materials be Stored and Shared?

Within organizations, it may be useful to develop shared material banks. Videos may be created and stored across various actor–action combinations. Stimulus cards, PowerPoint© slides, or Boom Cards can be created with all the needed color–shape, hour–minute, and dollar–cent combinations. Depending on the organization’s size, storage may be purely electronic within a secure file-sharing system or uploaded as a resource within CentralReach©. Transitioning to online libraries allows behavior analysts in California to benefit from the efforts of their colleagues in Georgia. Regardless of the methods used, clear organization and labeling of stimuli are necessary to ensure quick and easy location of the desired materials. In addition, sufficient exemplars should be created to support flexible programming driven by client preferences and culture. If a client’s favorite colors are teal and violet but all the color–shape stimuli are primary colors, using the available materials misses a chance to infuse a program with preferred materials. If the available holiday materials reflect only one faith tradition, practitioners must take extra steps to ensure the client’s cultural practices are represented. Creating and sharing larger libraries of materials will reduce the response effort on individual practitioners and promote more inclusive practices on a large scale. Overall, these efforts to create and organize materials will streamline “matrix planning” and matrix training.

Conclusions

Matrix training is a system of planning two-component (or more) responses to teach and probe in which probed responses may emerge based on the concept of recombinative generalization. Carefully arranging learning targets into a matrix and following the considerations we have outlined will translate into efficient instruction and learning. As illustrated throughout this tutorial, the efficacy of the matrix approach lies in the planning and design. By attending to these considerations and related questions, we hope practitioners will feel increased confidence deploying this approach to benefit their clients.

To summarize the key points, matrix training is not appropriate for all skills; rather, the skills need to be two-component (or more) responses where each component is controlled by a stimulus or part of a stimulus. When arranging matrices, the combinations “need to work” in that each component may be combined with all components on the other axis. As emitting two-component responses can be challenging, one-component responses should be strong in a client’s repertoire. Two guiding principles for selecting materials for matrix training are practicality and programming for stimuli in the natural environment. The number of dimensions of a matrix (e.g., two-dimensional, three-dimensional) should be based on the client’s repertoire and goals. The number of components on each axis may be based on teaching content (teach all components and combinations) or a cusp (sample some components and combinations).

There are two main arrangements of what to teach and what to probe: nonoverlap/diagonal and overlap/stepwise. A common recommendation is that if components are known, the nonoverlap/diagonal method may be used; if the components are unknown, the overlap/stepwise method may be needed. Matrix training with unknown components may be most efficient because both the components and the combinations are trained, though this is not always the case. Simultaneous training may be most efficient but also challenging for learners; sequential training, perhaps with submatrices, may improve outcomes. In addition, the components and combinations learned in one matrix may support emergent responding in additional, generalization matrices.

We recommend probing untrained targets throughout the course of a matrix to determine if recombinative generalization is occurring or if remedial strategies, such as training additional combinations, are needed. Even though untrained responses are often probed in extinction in research, reinforcing untrained/probed responses is recommended in practice. Finally, implement matrix training on a wide scale by using template data sheets stored in electronic program banks.

In conclusion, a teaching approach that requires substantial planning time is not necessarily more efficient than other approaches. We hope the tools and resources provided in this tutorial assist practitioners in implementing matrix training effectively and efficiently.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s40617-022-00733-5.

Acknowledgments We thank Dr. Bill Heward and Dr. Gretchen Dittrich for encouraging us to write this article.

Declarations

Conflicts of Interest We have no conflicts of interest relevant to this article to disclose.

References

Axe, J. B., & Sainato, D. M. (2010). Matrix training of preliteracy skills with preschoolers with autism. Journal of Applied Behavior Analysis, 43(4), 635–652. https://doi.org/10.1901/jaba.2010.43-635

Bergmann, S., Van Den Elzen, G., Kodak, T., Niland, H., & Dawson, D. (2022). Comparing matrix training procedures for children with autism spectrum disorder. Analysis of Verbal Behavior, 38(1), 24–53.

Brown, R. (1973). A first language: The early stages. Allen & Unwin.

Carnine, D. (1997). Bridging the research-to-practice gap. Exceptional Children, 63(4), 513–521.

Cook, B. G., & Odom, S. L. (2013). Evidence-based practices and implementation science in special education. Exceptional Children, 79(2), 135–144. https://doi.org/10.1177/001440291307900201

Curiel, E. S., & Curiel, H. (2021). Teaching receptive money identification skills using matrix training: A preliminary investigation. Behavioral Interventions, 36(3), 572–582. https://doi.org/10.1002/bin.1794

Curiel, E. S., Sainato, D. M., & Goldstein, H. (2016). Matrix training of receptive language skills with a toddler with autism spectrum disorder: A case study. Education and Treatment of Children, 39(1), 95–109.

Curiel, E. S., Sainato, D. M., & Goldstein, H. (2018). Matrix training for toddlers with autism spectrum disorder and other language delays. Journal of Early Intervention, 40(3), 268–284. https://doi.org/10.1177/1053815118788060

Curiel, E. S., Axe, J. B., Sainato, D. M., & Goldstein, H. (2020a). Systematic review of matrix training for individuals with autism spectrum disorder. Focus on Autism & Other Developmental Disabilities, 35(1), 55–64. https://doi.org/10.1177/1088357619881216

Curiel, E. S., Curiel, H., & Li, A. (2020b). Generative time telling in adults with disabilities: A matrix training approach. Behavioral Interventions, 35(2), 295–305. https://doi.org/10.1002/bin.1714

Engelmann, S., & Osborn, J. (2008). Language for learning. Science Research Associates.

Frampton, S. E., Wymer, S. C., Hansen, B., & Shillingsburg, M. A. (2016). The use of matrix training to promote generative language with children with autism. Journal of Applied Behavior Analysis, 49(4), 869–883. https://doi.org/10.1002/jaba.340

Frampton, S. E., Thompson, T. M., Bartlett, B. L., Hansen, B., & Shillingsburg, M. A. (2019). The use of matrix training to teach color shape tacts to children with autism. Behavior Analysis in Practice, 12(2), 320–330. https://doi.org/10.1007/s40617-018-00288-4

Fuchs, L. S., Fuchs, D., Hamlett, C. L., Walz, L., & Germann, G. (1993). Formative evaluation of academic progress: How much growth can we expect? School Psychology Review, 22, 27–48. https://doi.org/10.1080/02796015.1993.12085636

Fuller, J. L., & Fienup, D. M. (2018). A preliminary analysis of mastery criterion level: Effects on response maintenance. Behavior Analysis in Practice, 11(1), 1–8. https://doi.org/10.1007/s40617-017-0201-0

Goldstein, H. (1983). Recombinative generalization: Relationships between environmental conditions and the linguistic repertoires of language learners. Analysis and Intervention in Developmental Disabilities, 3(4), 279–293. https://doi.org/10.1016/0270-4684(83)90002-2

Goldstein, H., & Mousetis, L. (1989). Generalized language learning by children with severe mental retardation: Effects of peers’ expressive modeling. Journal of Applied Behavior Analysis, 22(3), 245–259. https://doi.org/10.1901/jaba.1989.22-245

Goldstein, H., Angelo, D., & Mousetis, L. (1987). Acquisition and extension of syntactic repertoires by severely mentally retarded youth. Research in Developmental Disabilities, 8(4), 549–574. https://doi.org/10.1016/0891-4222(87)90054-0

Grow, L. L., Carr, J. E., Kodak, T. M., Jostad, C. M., & Kisamore, A. N. (2011). A comparison of methods for teaching receptive labeling to children with autism spectrum disorders. Journal of Applied Behavior Analysis, 44(3), 475–498. https://doi.org/10.1901/jaba.2011.44-475

Grow, L. L., Kodak, T., & Carr, J. E. (2014). A comparison of methods for teaching receptive labeling to children with autism spectrum disorders: A systematic replication. Journal of Applied Behavior Analysis, 47(3), 600–605. https://doi.org/10.1002/jaba.141

Hatzenbuhler, E. G., Molteni, J. D., & Axe, J. B. (2019). Increasing play skills in children with autism spectrum disorder via peermediated matrix training. Education and Treatment of Children, 42(3), 295–319. https://doi.org/10.1353/etc.2019.0014

Horne, P. J., & Lowe, C. F. (1996). On the origins of naming and other symbolic behavior. Journal of the Experimental Analysis of Behavior, 65(1), 185–241. https://doi.org/10.1901/jeab.1996.65-185

Jimenez-Gomez, C., Rajagopal, S., Nastri, R., & Chong, I. M. (2019). Matrix training for expanding the communication of toddlers and preschoolers with autism spectrum disorder. Behavior Analysis in Practice, 12(2), 375–386. https://doi.org/10.1007/s40617-019-00346-5

Kemmerer, A. R., Vladescu, J. C., Carrow, J. N., Sidener, T. M., & Deshais, M. A. (2021). A systematic review of the matrix training literature. Behavioral Interventions, 36(2), 473–495. https://doi.org/10.1002/bin.1780

Kinney, E. M., Vedora, J., & Stromer, R. (2003). Computer-presented video models to teach generative spelling to a child with an autism spectrum disorder. Journal of Positive Behavior Interventions, 5(1), 22–29. https://doi.org/10.1177/10983007030050010301

Kohler, K. T., & Malott, R. W. (2014). Matrix training and verbal generativity in children with autism. Analysis of Verbal Behavior, 30(2), 170–177. https://doi.org/10.1007/s40616-014-0016-9

LaFrance, D. L., & Tarbox, J. (2020). The importance of multiple exemplar instruction in the establishment of novel verbal behavior. Journal of Applied Behavior Analysis, 53(1), 10–24. https://doi.org/10.1002/jaba.611

LeBlanc, L. A., Miguel, C. F., Cummings, A. R., Goldsmith, T. R., & Carr, J. E. (2003). The effects of three stimulus-equivalence testing conditions on emergent US geography relations of children diagnosed with autism. Behavioral Interventions: Theory & Practice in Residential & Community-Based Clinical Programs, 18(4), 279–289. https://doi.org/10.1002/bin.144

Marya, V., Frampton, S., & Shillingsburg, A. (2021). Matrix training to teach tacts using speech generating devices: Replication and extension. Journal of Applied Behavior Analysis, 54(3), 1235–1250. https://doi.org/10.1002/jaba.819

Mayfield, K. H., & Chase, P. N. (2002). The effects of cumulative practice on mathematics problem solving. Journal of Applied Behavior Analysis, 35(2), 105–123. https://doi.org/10.1901/jaba.2002.35-105

Miguel, C. F. (2016). Common and intraverbal bidirectional naming. Analysis of Verbal Behavior, 32(2), 125–138. https://doi.org/10.1007/s40616-016-0066-2

Mueller, M. M., Olmi, D. J., & Saunders, K. J. (2000). Recombinative generalization of within-syllable units in prereading children. Journal of Applied Behavior Analysis, 33(4), 515–531. https://doi.org/10.1901/jaba.2000.33-515

Naoi, N., Yokoyama, K., & Yamamoto, J. (2006). Matrix training for expressive and receptive two-word utterances in children with autism. Japanese Journal of Special Education, 43, 505–518.

Odom, S. L., Collet-Klingenberg, L., Rogers, S. J., & Hatton, D. D. (2010). Evidence-based practices in interventions for children and youth with autism spectrum disorders. Preventing School Failure: Alternative Education for Children and Youth, 54(4), 275–282. https://doi.org/10.1080/10459881003785506

Palmer, D. C. (2012). The role of atomic repertoires in complex behavior. The Behavior Analyst, 35(1), 59–73. https://doi.org/10.1007/BF03392266

Pauwels, A. A., Ahearn, W. H., & Cohen, S. J. (2015). Recombinative generalization of tacts through matrix training with individuals with autism spectrum disorder. Analysis of Verbal Behavior, 31(2), 200–214. https://doi.org/10.1007/s40616-015-0038-y

Saaybi, S., AlArab, N., Hannoun, S., Saade, M., Tutunji, R., Zeeni, C., Shbarou, R., Hourani, R., & Boustany, R. M. (2019). Pre-and post-therapy assessment of clinical outcomes and white matter integrity in autism spectrum disorder: pilot study. Frontiers in Neurology, 10, 877.

Saunders, K. J., & Spradlin, J. E. (1993). Conditional discrimination in mentally retarded subjects: Programming acquisition and learning set. Journal of the Experimental Analysis of Behavior, 60(3), 571–585. https://doi.org/10.1901/jeab.1993.60-571

Skinner, B. F. (1957). Verbal behavior. Appleton-Century Crofts.

Solano, A. S., Reeve, S. A., Reeve, K. F., DeBar, R. M., Dickson, C. A., & Milata, E. M. (2021). Comparing matrix sizes when teaching direction following to preschoolers with autism spectrum disorder. Behavioral Interventions, 36(4), 778–795. https://doi.org/10.1002/bin.1824

Stokes, T. F., & Baer, D. M. (1977). An implicit technology of generalization. Journal of Applied Behavior Analysis, 10(2), 349–367. https://doi.org/10.1901/jaba.1977.10-349

Sundberg, M. L. (2008). VB-MAPP: Verbal Behavior Milestones Assessment and Placement Program. AVB Press.

Wilson, E. R., Wine, B., & Fitterer, K. (2017). An investigation of the matrix training approach to teach social play skills. Behavioral Interventions, 32(3), 278–284. https://doi.org/10.1002/bin.1473

Wolery, M., Holcombe, A., Cybriwsky, C., Doyle, P. M., Schuster, J. W., Ault, M. J., & Gast, D. L. (1992). Constant time delay with discrete responses: A review of effectiveness and demographic, procedural, and methodological parameters. Research in Developmental Disabilities, 13(3), 239–266. https://doi.org/10.1016/0891-4222(92)90028-5

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Comparison of Flexible Prompt Fading to Error Correction for Children with Autism Spectrum Disorder


Justin B. Leaf & Ronald Leaf & Mitchell Taubman & John McEachin & Lara Delmolino
Published online: 12 September 2013
© Springer Science+Business Media New York 2013

Abstract

This study compared flexible prompt fading to an error correction procedure involving feedback and remedial trials for teaching four children with Autism Spectrum Disorder. Using a parallel treatment design nested into a multiple probe design, researchers taught each participant how to expressively label six pictures of Muppet characters with the flexible prompt fading procedure and six pictures of Muppet characters with the error correction procedure. The researchers evaluated the effectiveness, maintenance, efficiency, and acquisition during teaching for each participant across the two teaching conditions. Results indicated that both teaching procedures were effective, resulted in high rates of maintenance, and that participants responded correctly during the majority of teaching trials. However, flexible prompt fading was more efficient in terms of total number of trials and sessions, as well as total amount of time for participants to learn all targeted skills.

Keywords: Autism · Discrete trial teaching · Error correction · Flexible prompt fading · Prompting

Author Note We wish to thank Jeremy A. Leaf, Christine Miline, Amy Lentel, Marlene Brown, and Amanda Kwok for their help running sessions throughout the study. We wish to thank Shelli Imfeld, Julie Stiglich, and Cliff Anderson for their help throughout the project. Finally, we wish to thank Misty L. Oppenheim-Leaf for her insight on previous versions of this manuscript.

J. B. Leaf, R. Leaf, M. Taubman, J. McEachin
Autism Partnership Foundation, Seal Beach, CA, USA

L. Delmolino
Rutgers University, New Brunswick, NJ, USA

J. B. Leaf (*)
200 Marina Drive, Seal Beach, CA 90740, USA
e-mail: Jblautpar@aol.com

 

Discrete trial teaching (DTT) is commonly implemented to help teach students diagnosed with an autism spectrum disorder (ASD) (Lovaas 1987; Smith 2001). The three main components of DTT are: (a) a discriminative stimulus (SD) from the teacher; (b) a response by the student; and (c) a consequence provided by the teacher. Since students with ASD often need assistance from the teacher in order to display the correct response, an optional fourth step of DTT is prompting. Prompts can take many forms, known as prompt types, including: pointing to the correct response (e.g., Leaf et al. 2010), verbally stating the correct response (e.g., Leaf et al. 2011a, b), modeling the correct behavior (e.g., Bozkurt and Gursel 2005), reducing the number of choices (e.g., Soluaga et al. 2008), within-stimulus prompts (e.g., Schreibman 1975), and physically guiding the learner to the correct response or to engage in the correct behavior (e.g., Leaf et al. 2010).

When prompts are utilized, the intent is for control of the response to transfer systematically from the prompt to the intended SD. Researchers have created prompting systems to help ensure that teachers provide prompts correctly, fade prompts appropriately, prevent unintended prompts, and avoid the student becoming dependent on teacher prompts. Today, there are several different prompting systems that have been evaluated in the literature and are implemented clinically to teach students diagnosed with ASD. These prompting systems include: time delay (e.g., Charlop and Trasowech 1991; Morse and Schuster 2000), least-to-most prompting (e.g., Tarbox et al. 2007), and most-to-least prompting (e.g., Bloh 2008).

One prompting system that has been clinically implemented with numerous students with ASD, but has limited empirical evidence as to its effectiveness, is flexible prompt fading (FPF). Flexible prompt fading was first described by Lovaas and colleagues during investigations at the UCLA Young Autism Project (Lovaas 1987) and has more recently been described by Leaf and McEachin (1999). Flexible prompt fading is a prompting technique that relies on the teacher using his or her clinical judgment (that is, making in-the-moment interventional decisions based on defined parameters) to decide whether or not to prompt a student and what type of prompt to implement. Thus, FPF is similar to graduated guidance (e.g., MacDuff et al. 1993; Wolery and Gast 1984), as it allows clinicians the freedom to prompt based on general guidelines rather than specific rules.

Although clinicians prompt student responses based on in-the-moment decisions, there are several important guidelines that the clinician follows when implementing FPF. First, the clinician aims for the student to maintain a high level of success (e.g., 80 % correct with or without a prompt). Second, the clinician should provide a prompt if the student has had a recent history of errors on the task. If the student has had a long history with the task or has had a recent history of responding correctly, the clinician may elect to reduce the level of assistance or not provide a prompt at all. Ultimately, the clinician must determine whether the student is likely to make a correct response on the next trial, based on the criteria above, and prompt accordingly. If the student is likely to respond correctly, the teacher should provide a less intrusive prompt or not prompt; if the student is likely to respond incorrectly, the teacher should provide a prompt. Further guidelines of FPF have been described by Leaf and McEachin (1999).

The first study to evaluate flexible prompt fading was conducted by Soluaga et al. (2008). This study compared FPF to time delay for teaching various academic tasks to five children diagnosed with ASD. The FPF procedure consisted of the teacher implementing five different prompt types (i.e., physical, pointing, modeling, positioning prompts, and field reduction prompts) in a one-to-one instructional format.

During the implementation of the time delay procedure, however, only controlling prompts (i.e., the least intrusive prompt type that guarantees a correct response by the learner) were implemented. A modified parallel treatments design was utilized to compare the effectiveness of the two prompting procedures. Results of the study indicated that both prompting procedures were effective; results in terms of efficiency were mixed.

Although prompting has been demonstrated to be an effective component of DTT, some clinicians may elect not to implement DTT with the provision of antecedent prompts. When a teacher does not provide an antecedent prompt (i.e., a prompt delivered before the student’s response), he or she is relying on either pure trial-and-error learning or on providing some type of explicit error correction procedure (Rodgers and Iwata 1991). Since trial-and-error procedures can have undesirable side effects, error correction procedures (EC) are more widely implemented. In EC, teachers provide reinforcement for correct responses and, for incorrect responses, corrective feedback (e.g., “Nope, that is not it.”) followed by modeling the correct response (e.g., “This is an apple.”). Providing corrective feedback only, without modeling the correct response, is an example of consequence-based EC that does not directly assist the student in identifying the correct response (i.e., trial-and-error). Providing the correct model in addition to the corrective feedback may increase the rate of learning (Smith et al. 2006). Finally, the teacher may implement another immediate unprompted opportunity for the student to display the appropriate behavior (i.e., a remedial trial) after the model has been provided.

Researchers have found that error correction procedures can be effective in teaching a wide variety of skills including: verb usage (Schumaker and Sherman 1970), matching-to-sample tasks (Rodgers and Iwata 1991), and expressive labeling of sight words (Worsdell et al. 2005). In 2005, Worsdell and colleagues evaluated the effects of error correction procedures in teaching 11 adults with developmental disabilities to improve their ability to recognize sight words. Worsdell et al. demonstrated that error correction procedures utilizing a remedial trial after every incorrect response were effective in increasing sight word recognition.

Smith et al. (2006) compared three different teaching conditions for teaching matching words to pictures for six participants diagnosed with ASD. In the first condition, error statement, the teacher said “no” any time the participants made an incorrect response (similar to the corrective feedback described above). In the second condition, modeling, the teacher stated the correct response any time the participants made an incorrect response. In the third condition, the teacher provided no feedback (pure extinction) any time the participants made an incorrect response. Results of the study were idiosyncratic across participants with regard to acquisition rate. For four of the six participants, EC was superior to no feedback. The other two participants were fast learners and performed as well in the no-feedback condition as they did in the EC condition. Of the four who performed better with EC, two did equally well with both EC methods, while one made fewer errors with the error statement and one made fewer errors with modeling of the correct response.

While the research to date has shown that a number of error correction procedures are effective in teaching new skills, many professionals still warn against teaching procedures that allow students to make errors (e.g., Gast 2011). Research has shown that under some circumstances errors can lead to more errors, that students may display aberrant behaviors after making an error, and that error correction procedures may not be as effective as other prompting procedures (e.g., Ferster and DeMeyer 1962). Therefore, more direct comparison studies are warranted to provide further evidence about which procedures are most effective and efficient for teaching new skills to children with ASD. Additionally, clinicians working with individuals diagnosed with ASD should implement the most effective and efficient procedures. Thus, the purpose of this study was to compare a consequence-based procedure (i.e., an error correction procedure), which did not attempt to minimize participant errors, to an antecedent prompt procedure (i.e., a flexible prompt fading procedure), which attempted to minimize errors through the use of antecedent prompts. In doing so, we compared the effectiveness, maintenance, and efficiency of the two procedures in teaching expressive labeling to four high-functioning children diagnosed with ASD.

Method

Participants

Participants all had a formal diagnosis of autistic disorder from an outside agency, ranged in age from 4 to 6 years, and had IQ scores ranging from 86 to 128. Three of the four participants had a history of educational intervention that used a flexible prompt fading procedure. None of the participants had a history of error correction.

Rob was a 5-year-old boy independently diagnosed with autistic disorder. Rob had a Wechsler Preschool and Primary Scale of Intelligence-Third Edition (WPPSI-III) FSIQ score of 128, a Vineland-II Adaptive Behavior Scales Survey Interview Form (VABS-II) adaptive behavior score of 94, a Gilliam Autism Rating Scale (GARS-II) autism quotient of 98 (probability of autism very likely), and a PPVT-4 standard score of 123. Rob had received a mean of 20 h of behavioral treatment per week over the prior 18 months and was placed in a special education preschool classroom with supports.

Jimmy was a 4-year-old boy independently diagnosed with autistic disorder. Jimmy had a WPPSI-III FSIQ score of 86, a VABS-II adaptive behavior score of 81, a GARS-II autism quotient of 98 (probability of autism very likely), and a PPVT-4 standard score of 100. Jimmy had received a mean of 39.5 h of behavioral treatment per week over the prior 24 months and was placed in a special education classroom without supports.

Billy was a 5-year-old boy independently diagnosed with autistic disorder. Billy had a WPPSI-III FSIQ score of 99, a VABS-II adaptive behavior score of 88, a GARS-II autism quotient of 89 (probability of autism very likely), and a PPVT-4 standard score of 117. Billy had received a mean of 13 h of behavioral treatment per week over the prior 21 months and was placed in a general education preschool classroom without supports.

Kenny was a 6-year-old boy independently diagnosed with autistic disorder. Kenny had a Stanford Binet–Fifth Edition FSIQ score of 88, an ADOS (Module 3) score meeting the Autism cut-off (communication and social interaction combined score of 17), and a PLS-4 standard score of 88 (percentile of 21 and age equivalent of 5 years 11 months). Kenny had received a mean of 25 h of behavioral treatment per week for a period of 24 months and was placed in an integrated preschool classroom with supports. He was the only participant without prior exposure to FPF.

Setting and Researchers

This study took place in two different settings. The setting for three of the participants (i.e., Rob, Jimmy, and Billy) was a small research room in a private behavior intervention agency’s Southern California office. The research room measured approximately 2.7 m by 2.7 m and contained a table, cabinets, chairs, couch, closets for research materials, and a desk. At this research site, there were three researchers who conducted the study on a daily basis. Two of the researchers had a Bachelor’s degree in psychology and one had a Master’s degree in education. Each researcher had received an initial intensive training lasting at least 2 months. The training consisted of both didactic instruction and hands-on training on various topics, such as: applied behavior analysis, autism, reinforcement, prompting, discrete trial teaching, error correction, and teaching interactions. After this initial training, each researcher had over 1 year of direct experience working with individuals diagnosed with autism and implementing the procedures utilized in this study.

The second setting, for Kenny only, was a small research room in a New Jersey university that provides behavioral intervention for children and adults diagnosed with ASD. The research room contained a table, chairs, and file cabinets. On a few occasions, due to scheduling conflicts, sessions took place in an office or an unused classroom, both of which were familiar to the student. At this research site, there were two primary researchers who conducted the study on a daily basis. One of the researchers held a doctorate degree and one had a master’s degree pending at the time of the study. Both researchers had over 20 years of experience in the field and had received intensive training in applied behavior analysis, autism, reinforcement, prompting, discrete trial teaching, and error correction procedures. Both of the researchers also had extensive experience (over 10 years) utilizing the procedures in this study.

Skills Taught

Each participant was taught to expressively label the names of 12 pictures of Muppet© characters. These skills were selected based upon each participant’s supervisor’s recommendation, as the supervisors were teaching each participant pop culture knowledge, a skill that all participants needed. Additionally, these skills were not being targeted in each participant’s current clinical intervention, so the targets would not be inadvertently taught elsewhere. Character names were taught in pairs, and the stimulus pairs were randomly assigned to one of the two conditions prior to baseline. Table 1 shows the item pairs that were taught to each participant with each procedure.

General Procedure

The researchers conducted research sessions 3 to 5 days per week; only one research session occurred per day. The length of the sessions ranged from 5 to 25 min dependent upon the type of session (e.g., probe session only or probe session plus teaching sessions) and participant responding (e.g., more reinforcement breaks for correct responding). During some sessions, the participant only received probe trials (full probe sessions) to assess baseline levels for skills not yet taught and to assess maintenance levels for skills previously taught (see below). During the majority of research sessions, the researchers implemented daily probe trials to test for acquisition, a short break (approximately 2 min), one of the teaching conditions (i.e., FPF or EC), another short break (approximately 2 min), and then the second teaching condition (i.e., the procedure that was not implemented first). The order of FPF and EC was randomly determined prior to each research session.

Table 1 Targeted skills

Participant | First pair (FPF) | First pair (EC) | Second pair (FPF) | Second pair (EC) | Third pair (FPF) | Third pair (EC)
Jimmy | Scooter & Honeydew | Beaker & Janice | Sweetums & Camilla | Rizzo & Sam | Floyd & Lew | Dr. Teeth & Animal
Rob | Beaker & Janice | Scooter & Honeydew | Lew & Sweetums | Rizzo & Sam | Dr. Teeth & Zoot | Camilla & Floyd
Billy | Beaker & Janice | Scooter & Honeydew | Rizzo & Pepe | Sweetums & Camilla | Dr. Teeth & Zoot | Floyd & Lew
Kenny | Fozzie & Waldorf | Sweetums & Camilla | Zoot & Lew | Rowlf & Floyd | Dr. Teeth & Statler | Rizzo & Sam

Each trial (probe and teaching trials alike) began with the researcher holding up one of the cards displaying a Muppet character in view of the participant. Next, the researcher gave an instruction to the participant to provide the name of the Muppet character (e.g., “What is his or her name?”), and allowed approximately 5 s for the participant to respond. During probe trials, no prompts, reinforcement, or corrective feedback were provided to the participant; the researcher gave a neutral acknowledgment (e.g., “Thanks” or “Thank you”) regardless of whether the participant’s response was correct or incorrect. During teaching trials, however, the researcher provided prompts, reinforcement, and feedback dependent upon the teaching condition being implemented (see below).

Prior to beginning intervention, potential tangible reinforcers (e.g., toys or edibles) were selected, which were used during full probe sessions, daily probe sessions, FPF teaching sessions, and EC teaching sessions. The researchers selected tangible reinforcers by observing the participant, asking the participant what he wanted to work for, or interviewing the participant’s teachers and/or parents. The researchers selected approximately 5 different tangible reinforcers for each participant. The reinforcers were held constant across both teaching conditions throughout the study.

Full Probe Sessions

The researchers conducted full probe sessions prior to the teaching of any new stimulus items to determine current baseline performance. Additionally, after the participant met mastery criterion (i.e., 100 % correct on all daily probe trials for 3 consecutive daily probes) on at least one stimulus pair, researchers administered a full probe session on all stimulus pairs to evaluate whether correct responding on previously taught pairs was maintained. The researchers evaluated all stimulus items four times each during full probe sessions and randomly determined the order for presentation during these sessions; thus, each full probe session consisted of 48 full probe trials. No reinforcement was provided to the participants contingent upon correct responding during full probe sessions. The researchers did provide reinforcement to participants on a fixed ratio schedule (FR-3 or FR-4) contingent upon the participant displaying appropriate behaviors (e.g., sitting in his or her chair and not engaging in any aberrant behaviors); the reinforcer provided was randomly selected.

Daily Probe Sessions

The researchers conducted daily probes prior to each teaching session to evaluate whether participants were learning to correctly label the Muppet characters that were currently being taught to them. Daily probe trials were conducted in the same manner as full probe trials. The daily probe sessions consisted of 16 randomized probe trials; four probe trials were conducted for each target skill currently being taught (2 skills with EC and 2 skills with FPF). Mastery criterion was set at 100 % correct responding on all probe trials for a stimulus pair (i.e., 8 probe trials) across three consecutive daily probes. The researchers provided reinforcement to the participants for displaying appropriate behavior (e.g., sitting correctly) on an FR-4 schedule (similar to the reinforcement provided during full probe sessions). Once a participant met mastery criterion for a stimulus pair, teaching on that stimulus pair stopped; daily probes, however, continued until at least three more daily probe sessions were completed or the second stimulus pair reached mastery criterion. Following daily probes, the researchers provided the participant with a brief 1- to 2-min break prior to beginning the first teaching session. During the first session in which new stimulus pairs were being taught, no daily probe was implemented.
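As a minimal illustration (not part of the original study's materials), the mastery rule described above, 100 % correct on all eight probe trials for a stimulus pair across three consecutive daily probes, can be expressed as a short check; the function and variable names here are hypothetical.

```python
def pair_mastered(daily_probe_correct, window=3, trials_per_pair=8):
    """Return True if the last `window` daily probe sessions each show
    all `trials_per_pair` probe trials correct for one stimulus pair.

    daily_probe_correct: per-session counts of correct probe trials for
    this pair, in chronological order.
    """
    if len(daily_probe_correct) < window:
        return False  # not enough consecutive probes yet
    return all(c == trials_per_pair for c in daily_probe_correct[-window:])

# Mastery is reached only after three consecutive perfect probes.
print(pair_mastered([6, 8, 8, 8]))  # True
print(pair_mastered([8, 8, 7]))     # False
```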

Teaching Session

Flexible Prompt Fading (FPF) A total of 20 teaching trials per session were implemented in this condition. The FPF condition started with the researcher placing a color mat (e.g., a yellow mat) in front of the participant; the color mat indicated that the FPF condition was going to be implemented. In the FPF condition, a trial started with the researcher holding up a picture of the Muppet character in sight of the participant. Next, the researcher provided a discriminative stimulus and gave the participant 5 s to respond to the instruction. If the participant labeled the character correctly (prompted or unprompted), the researcher provided the participant with praise and brief access (approximately 5 s) to a reinforcer (described above); the researcher randomly selected one of the five reinforcers and provided it to the participant. Thus, the inter-trial interval was approximately 5 s following correct responses. If the participant labeled the character incorrectly, the researcher said “Nope, that’s not it” and moved to the next trial; the following trial could be a remedial trial, or the researcher had the autonomy to move on to the next predetermined trial. Thus, the inter-trial interval was approximately 3 s following incorrect responses.

During the FPF condition, the researchers had the flexibility to provide antecedent prompts to help ensure that the participant maintained a high level of correct responding. Although the use of prompts during FPF is based primarily upon researcher judgment, those decisions were governed by several guidelines that the researchers were instructed to follow.

Most importantly, the researchers aimed to have the participant respond correctly (prompted or unprompted) on at least 80 % of trials. The researchers were instructed to assess prior to each trial whether or not the participant was likely to respond correctly. If the researcher determined that the participant was likely to respond correctly without a prompt, then the researcher did not provide a prompt to the participant. If the researcher determined that the participant was likely to respond incorrectly, then the researcher provided a prompt to the participant. In order to make this assessment, the researcher first looked at the previous responses of the participant. If the participant was responding correctly without prompts on previous trials, or if the researcher had prompted the participant on several previous trials, then the researcher could either reduce the level of assistance or not prompt at all. If the participant was responding incorrectly with a less assistive prompt, then the researcher could provide a more intrusive prompt. Additionally, if the participant had many previous sessions with the target, the researcher could elect not to provide a prompt. Finally, the researcher assessed the participant’s current behaviors. If the participant’s tolerance for frustration was low, the researcher was more likely to provide a prompt.
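The decision guidelines above amount to an informal heuristic applied trial by trial. A rough sketch of that logic is given below; this is our illustration, not the authors' procedure, and the 80 % threshold is the only value taken from the text (the session threshold and all names are assumptions).

```python
def should_prompt(recent_correct, sessions_with_target,
                  low_frustration_tolerance, success_target=0.80):
    """Heuristic sketch of the FPF prompting decision (illustrative only).

    recent_correct: booleans for recent trials, True for a correct
    response (prompted or unprompted), most recent last.
    """
    if not recent_correct:
        return True  # no history with the task: assist on early trials
    # Keep overall success at or above the 80 % target.
    if sum(recent_correct) / len(recent_correct) < success_target:
        return True
    # Low frustration tolerance makes a prompt more likely.
    if low_frustration_tolerance:
        return True
    # Long history with the target and a high success rate: fade assistance.
    if sessions_with_target > 5:  # hypothetical threshold
        return False
    # Otherwise, prompt only following a recent error.
    return not recent_correct[-1]
```

A learner with a perfect recent record and no frustration would go unprompted, while a recent run of errors or a frustrated learner would be prompted, matching the guidelines in the text.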

Second, the researcher had the flexibility to implement multiple prompt types (e.g., verbal prompt, partial verbal prompt, model prompt) at his or her discretion. Each researcher was guided to implement any prompt type that he or she thought would result in the participant responding correctly on any given trial. Thus, unlike other prompting systems (e.g., most-to-least prompting), in which the researcher has to provide a given prompt at a given point, the researcher had the discretion to provide any prompt type at any point. Furthermore, the researchers were instructed to fade prompts as quickly as possible to transfer stimulus control from the prompt to the instruction alone.

A third guideline was that the researcher had to provide all prompts directly after the instruction and prior to the participant engaging in a correct or incorrect response. This was different than the error correction procedure where the researcher provided instructional feedback following a participant’s incorrect response (see below).

Error Correction (EC) A total of 20 teaching trials per research session were implemented in this condition. The EC condition started with the researcher placing a color mat (e.g., a red mat) in front of the participant; the color mat indicated that the EC condition was going to be implemented. In the EC condition, a trial started with the researcher holding up a picture of the Muppet character in sight of the participant. Next, the researcher provided a discriminative stimulus (e.g., “What is his or her name?”) and gave the participant 5 s to respond to the instruction. If the participant labeled the character correctly, the researcher provided the participant with praise and provided one of the same reinforcers used in the FPF condition for approximately 5 s. After 5 s, the researcher asked the participant to hand him or her back the toy and implemented the next planned teaching trial. Thus, the inter-trial interval was approximately 5 s following correct responses.

If the participant incorrectly labeled the character or did not respond to the instruction within the 5 s, the researcher said, “No, that’s not it” followed by stating the correct name (e.g., “This is [character’s name].”) of the character. The participant was not required to imitate the modeled response. Instead, the researcher provided one remedial trial, providing the participant with the opportunity to demonstrate the correct response. The remedial trial started with the researcher holding up a picture of the Muppet character in sight of the participant. Next, the researcher provided a discriminative stimulus and gave the participant 5 s to respond to the instruction. If the participant labeled the character correctly, the researcher provided the participant with praise, but no tangible reinforcement. If the participant did not respond correctly or did not respond to the instruction within the 5 s, the researcher said, “No, that is not it” followed by stating the correct name of the character. Regardless of the outcome of the remedial trial, the researcher moved on to the next planned trial.

Dependent Variable and Data Collection

The primary measure was participants’ skill acquisition as measured by daily probe trials. The researchers measured how many stimulus pairs the participants mastered across the two teaching conditions. Mastery criterion was set as the participant responding 100 % correct for targets of a stimulus pair for three consecutive daily probe sessions. During all probe trials, the researcher recorded the response of the participant. A correct response was recorded if the participant correctly named the picture of the Muppet character within 5 s of the researcher’s instruction. An incorrect response was recorded if the participant incorrectly named the picture of the Muppet character within 5 s of the researcher’s instruction. A no-response was recorded if the participant did not give any response within 5 s of the researcher’s instruction.
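The response definitions above (correct, incorrect, or no response within 5 s of the instruction) can be sketched as a small classifier; this is an illustration of the scoring rules, not software used in the study, and all names are hypothetical.

```python
def score_probe_response(response, latency_s, target, limit_s=5.0):
    """Classify a probe-trial response per the operational definitions:
    a response is scored only if it occurs within `limit_s` seconds of
    the instruction; otherwise it is a no-response.
    """
    if response is None or latency_s > limit_s:
        return "no-response"
    return "correct" if response == target else "incorrect"

print(score_probe_response("Beaker", 2.0, "Beaker"))  # correct
print(score_probe_response("Animal", 2.0, "Beaker"))  # incorrect
print(score_probe_response(None, 6.0, "Beaker"))      # no-response
```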

The second measure was how well the participants maintained skills taught to them, which was assessed on full probe trials. During each full probe session the researchers recorded participant responding during each probe trial. As described above, the participants could respond correctly, incorrectly, or have no response.

The third measure was the relative efficiency of the two interventions. We measured the total number of teaching sessions, total number of trials, and total amount of teaching time required for each participant to master all of his targets across the two teaching conditions. Each research session consisted of 20 total teaching trials for FPF and 20 total teaching trials for EC. A teaching trial was defined as any time the researcher presented an instruction for the participant to respond, regardless of whether the trial was prompted or preceded by an error. Thus, each remedial trial in the EC condition was counted as a separate trial (i.e., 1 of the 20 trials). A remedial trial in the EC condition was defined as any time the participant made an incorrect response and the teacher re-presented the same targeted behavior on the next trial. Therefore, if a participant was incorrect on the first opportunity and was incorrect on a remedial trial, this was scored as two incorrect responses. Remedial trials in the FPF condition also counted as separate trials. A remedial trial in the FPF condition could occur in three circumstances: (1) if the participant responded incorrectly and the teacher provided a follow-up trial of the same targeted response (similar to the EC condition); (2) if the participant responded correctly and independently and the teacher decided to provide another opportunity for the learner to respond to the same target; and (3) if the participant responded correctly with the provision of a prompt and the teacher elected for the student to have an opportunity to respond correctly but without a prompt being provided.
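To make the efficiency bookkeeping concrete, the tally below shows the counting rule that every presented instruction, including remedial trials, counts as one teaching trial. The session-log format is our assumption for illustration only.

```python
# Each inner list is one teaching session; each dict is one presented
# instruction. 'remedial' marks a re-presentation after an error (EC)
# or a teacher-elected follow-up opportunity (FPF).
sessions = [
    [{"correct": False, "remedial": False},
     {"correct": True, "remedial": True},   # remedial trial counts as a trial
     {"correct": True, "remedial": False}],
    [{"correct": True, "remedial": False}],
]

total_sessions = len(sessions)
total_trials = sum(len(s) for s in sessions)  # remedial trials included
total_errors = sum(not t["correct"] for s in sessions for t in s)

print(total_sessions, total_trials, total_errors)  # 2 4 1
```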

The final measure captured the percentage of participant responses during teaching trials across the two conditions. A total of five response types were evaluated: (1) overall correct trials without prompts (first opportunity and remedial trials); (2) correct trials without prompts on the first opportunity; (3) incorrect/no-response trials without prompts (first opportunity and remedial trials); (4) prompted correct trials; and (5) prompted incorrect/no-response trials.

Correct and incorrect/no-response trials had the same operational definitions as responses during probe trials. Prompted correct trials were scored if the researcher provided an antecedent prompt (e.g., verbally stating the correct response) and the participant correctly labeled the picture. Prompted incorrect trials were scored if the researcher provided an antecedent prompt (e.g., verbally stating the correct response) and the participant incorrectly labeled the picture. Prompted responses were never counted as unprompted correct or incorrect responses. Remedial trials were scored based upon the learner’s response.

Experimental Design

A parallel treatment design (Gast and Wolery 1988) nested in a multiple probe design across skill sets, replicated across participants, was used to evaluate the effectiveness of the two prompting procedures. When implementing a parallel treatment design, it is critical that the order of the two procedures be randomly determined ahead of time, as was done throughout this study. With a parallel treatment design, experimental control is established when one of the prompting procedures results in more rapid skill acquisition than the other prompting procedure. Since experimental control may be undermined if both procedures result in equal rates of acquisition, the additional use of the multiple probe design helps to ensure experimental control. With the multiple probe design, the researcher implements the independent variable (i.e., the prompting systems) on one of the dependent variables (i.e., one of the stimulus sets) and does not intervene on the other dependent variables until an increasing trend is shown. Thus, experimental control is established if learning occurs when, and only when, the intervention is implemented.

Interobserver Agreement

The researcher scored the participants’ responses during every session. A second observer (i.e., research assistant) simultaneously and independently recorded participant responses during 52.1 % (range, 41.6 % to 75 % across participants) of the full probe sessions, 48 % (range, 30 % to 71.4 % across participants) of the daily probe sessions, 58.1 % (range, 33 % to 100 % across participants) of the FPF sessions, and 62.5 % (range, 36 % to 100 % across participants) of the EC sessions; interobserver reliability was scored both in vivo and from videotapes of the research sessions. Interobserver agreement was calculated by totaling the number of agreements (i.e., trials in which both observers scored the same response), dividing by the number of agreements plus disagreements (i.e., trials in which the two observers scored a different participant response), and converting this ratio to a percentage. Percentage agreement across all participant responses was 99.5 % (range, 95.8 % to 100 % per session) for full probe trials, 98.9 % (range, 87.5 % to 100 % per session) for daily probe trials, 99.6 % (range, 95 % to 100 % per session) for FPF teaching trials, and 98.5 % (range, 90 % to 100 % per session) for EC teaching trials, summed across all four participants.
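The agreement computation described above is point-by-point (trial-by-trial) agreement. As a worked example (ours, not the authors' software), the arithmetic is:

```python
def interobserver_agreement(observer_a, observer_b):
    """Percentage of trials on which two observers recorded the same
    participant response: agreements / (agreements + disagreements) * 100.
    """
    assert len(observer_a) == len(observer_b)
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * agreements / len(observer_a)

# 19 agreements out of 20 trials -> 95.0 %
a = ["correct"] * 20
b = ["correct"] * 19 + ["incorrect"]
print(interobserver_agreement(a, b))  # 95.0
```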

Treatment Fidelity

The researchers measured correct instructor behaviors during full probe sessions, daily probe sessions, flexible prompt fading trials, and error correction trials (contact the author for the treatment fidelity checklists). During full and daily probe trials, correct instructor behaviors included: (a) holding up the picture in the participant’s view; (b) delivering an instruction for the participant to name the Muppet character; (c) allowing approximately 5 s (i.e., plus or minus 1 s) for the participant to respond; and (d) providing the participant with neutral praise (e.g., “Thank you” or “Thanks”) regardless of the participant’s response.

During flexible prompt fading trials, correct instructor behaviors included: (a) holding up the picture in the participant’s view; (b) delivering an instruction for the participant to name the Muppet character; (c) allowing approximately 5 s (i.e., plus or minus 1 s) for the participant to respond; (d) providing reinforcement (i.e., social praise and a toy) only if the participant responded correctly; and (e) providing corrective feedback (i.e., “That’s not it”) only if the participant responded incorrectly. In addition to these correct instructor behaviors, we also analyzed whether or not the researcher(s) maintained a participant’s correct level of responding (correct, or correct after the provision of a prompt) at 80 % or above (i.e., the most important guideline of FPF).
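The 80 % guideline above amounts to a simple check over a session's trial outcomes. The sketch below is a hypothetical illustration of that check, not part of the authors' materials; outcome labels and names are assumptions.

```python
# Illustrative check of the FPF fidelity guideline: the learner's "correct
# level" of responding (independent correct OR correct after a prompt)
# should be kept at or above 80 % of trials.

def meets_fpf_guideline(trial_outcomes, threshold=80.0):
    """trial_outcomes: per-trial codes such as "independent", "prompted",
    or "incorrect". Returns True if correct-level responding meets the
    threshold percentage."""
    correct_level = sum(o in ("independent", "prompted") for o in trial_outcomes)
    return correct_level / len(trial_outcomes) * 100 >= threshold

# Hypothetical 10-trial session: 9 of 10 trials at the correct level (90 %).
session = ["independent", "prompted", "independent", "incorrect",
           "prompted", "independent", "independent", "independent",
           "prompted", "independent"]
print(meets_fpf_guideline(session))  # True
```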

Correct instructor behaviors measured for error correction were: (a) holding up the picture in the participant’s view; (b) delivering an instruction for the participant to name the Muppet character; (c) allowing approximately 5 s (i.e., plus or minus 1 s) for the participant to respond; (d) providing reinforcement (i.e., social praise and a toy) only if the participant responded correctly; (e) providing corrective feedback (i.e., “That’s not it”) only if the participant responded incorrectly; (f) providing informative feedback only after an incorrect response; and (g) providing a remedial trial if the participant responded incorrectly on the first opportunity to respond independently.

To assess treatment fidelity, an independent observer (i.e., a research assistant) recorded the researcher’s behaviors during 37.5 % (range, 33.3 % to 41.6 % across participants) of full probe sessions, 34 % (range, 30 % to 35.7 % across participants) of daily probe sessions, 53.4 % (range, 33 % to 100 % across participants) of the flexible prompt fading sessions, and 55.1 % (range, 35.7 % to 100 % across participants) of error correction sessions. The observer reported that the researcher engaged in correct instructor behaviors on 99.2 % (range, 94 % to 100 % across sessions) of full probe trials; 99.6 % (range, 94 % to 100 % across sessions) of daily probe trials; 98 % (range, 80 % to 100 % across sessions) of flexible prompt fading trials; and 97.4 % (range, 80 % to 100 % across sessions) of error correction trials. Additionally, the researchers maintained participant responding above 80 % during 100 % of flexible prompt fading sessions in which treatment fidelity was assessed.

Results

Skill Acquisition, Mastery Criterion, and Maintenance

The researchers taught Jimmy three stimulus pairs using FPF and three stimulus pairs using EC (see Fig. 1). Jimmy reached mastery criterion for all of the stimulus pairs taught using FPF and all of the stimulus pairs taught using EC. The assessment of maintenance was conducted during full probe sessions after Jimmy reached mastery criterion. The first assessment of maintenance for skills taught with FPF was 7 days (set 1), 3 days (set 2), and 5 days (set 3) after mastery criterion was met. The first assessment of maintenance for skills taught with EC was 5 days (set 1), 4 days (set 2), and 3 days (set 3) after mastery criterion was met. The final assessment of maintenance for skills taught with FPF was 58 days (set 1), 38 days (set 2), and 8 days (set 3) after mastery criterion was met. The final assessment of maintenance for skills taught with EC was 56 days (set 1), 37 days (set 2), and 6 days (set 3) after mastery criterion was met. During the assessment of maintenance, Jimmy’s mean correct responding on the stimulus pairs taught with FPF and EC was 90.2 % (range, 75–100 %) and 92.1 % (range, 87.5–100 %), respectively.

Fig. 1 Jimmy probe data

The researchers taught Rob three stimulus pairs using FPF and three stimulus pairs using EC (see Fig. 2). Rob reached mastery criterion for all of the stimulus pairs taught using FPF and all of the stimulus pairs taught using EC. The assessment of maintenance was conducted during full probe sessions after Rob reached mastery criterion. The first assessment of maintenance for skills taught with FPF and EC was 1 day (set 1), 1 day (set 2), and 4 days (set 3) after Rob reached mastery criterion. The final assessment of maintenance for skills taught with FPF and EC was 53 days (set 1), 31 days (set 2), and 7 days (set 3) after Rob reached mastery criterion. During the assessment of maintenance, Rob’s mean correct responding for all stimulus pairs taught with FPF and EC was 100 %.

Fig. 2 Rob probe data

The investigators taught Billy three stimulus pairs using FPF and three stimulus pairs using EC (see Fig. 3). Billy reached mastery criterion for all of the stimulus pairs taught using FPF and all of the stimulus pairs taught using EC. The assessment of maintenance was conducted during full probe sessions after Billy reached mastery criterion. The first assessment of maintenance for skills taught with FPF was 1 day (set 1), 7 days (set 2), and 4 days (set 3) after Billy reached mastery criterion. The first assessment of maintenance for skills taught with EC was 1 day (set 1), 6 days (set 2), and 4 days (set 3) after Billy reached mastery criterion. The final assessment of maintenance for skills taught with FPF was 51 days (set 1), 29 days (set 2), and 13 days (set 3) after Billy reached mastery criterion. The final assessment of maintenance for skills taught with EC was 51 days (set 1), 28 days (set 2), and 13 days (set 3) after Billy reached mastery criterion. During the assessment of maintenance, Billy’s mean correct responding on stimulus pairs taught with both FPF and EC was 98.6 % (range, 87.5 % to 100 %).

Fig. 3 Billy probe data

The investigators taught Kenny three stimulus pairs using FPF and three stimulus pairs using EC (see Fig. 4). Kenny reached mastery criterion for all of the stimulus pairs taught using FPF and all of the stimulus pairs taught using EC. The assessment of maintenance was conducted during full probe sessions after Kenny reached mastery criterion. The first assessment of maintenance for skills taught with FPF was 11 days (set 1) and 1 day (set 2 and set 3) after Kenny reached mastery criterion. The first assessment of maintenance for skills taught with EC was 1 day (set 1, set 2, and set 3) after Kenny reached mastery criterion. The final assessment of maintenance for skills taught with FPF was 39 days (set 1), 14 days (set 2), and 5 days (set 3) after Kenny reached mastery criterion. The final assessment of maintenance for skills taught with EC was 31 days (set 1), 14 days (set 2), and 5 days (set 3) after Kenny reached mastery criterion. During the assessment of maintenance, Kenny’s mean correct responding on stimulus pairs taught with FPF and EC was 92.8 % (range, 50 % to 100 %) and 97.9 % (range, 87.5 % to 100 %), respectively.

Fig. 4 Kenny probe data

Efficiency

The researchers measured the total amount of sessions, total amount of teaching trials, and total amount of time it took participants to reach mastery criterion across the two teaching methodologies (see Table 2). Data summarized across all participants indicated that targets taught with FPF required fewer sessions, trials, and total amount of teaching time to reach mastery criterion; however, results were idiosyncratic among the participants. Billy learned skills taught with EC in fewer sessions, trials, and total amount of teaching time than skills taught with FPF. Jimmy and Kenny learned skills taught with FPF in fewer sessions, trials, and total amount of teaching time than skills taught with EC. Rob learned skills in an equivalent number of sessions and trials with both teaching conditions; however, skills with FPF required less teaching time.

Table 2 Efficiency data

Participant               Total sessions (FPF)   Total sessions (EC)   Total trials (FPF)   Total trials (EC)   Total time (FPF), min:s   Total time (EC), min:s

Jimmy                     10                      14                    200                  280                 67:16                     89:51
Rob                       10                      10                    200                  200                 66:19                     69:22
Billy                     12                      11                    240                  220                 83:38                     82:44
Kenny                     11                      14                    220                  280                 93:20                     130:12
Across all participants   43                      49                    860                  980                 310:33                    368:08
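The session and trial totals in Table 2 can be verified with simple arithmetic; the snippet below is illustrative only, with the per-participant values taken directly from the table (the trial counts are consistent with 20 trials per session).

```python
# Illustrative arithmetic check of the "Across all participants" row in
# Table 2 (sessions and trials only).
data = {
    # participant: (sessions_fpf, sessions_ec, trials_fpf, trials_ec)
    "Jimmy": (10, 14, 200, 280),
    "Rob":   (10, 10, 200, 200),
    "Billy": (12, 11, 240, 220),
    "Kenny": (11, 14, 220, 280),
}
# Sum each column across participants.
totals = [sum(column) for column in zip(*data.values())]
print(totals)  # [43, 49, 860, 980]
```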

Participant Responding During the Two Teaching Conditions

The researchers measured participant responding during teaching trials across the two teaching conditions. Figure 5 presents the data for each individual participant across the two teaching conditions. The top panel represents the percentage of independent correct trials; the second panel represents the percentage of prompted trials (FPF only); the third panel represents the percentage of incorrect trials; the fourth panel represents the percentage of remedial trials; and the bottom panel represents the number of trials to mastery. Across all participants, overall correct responding exceeded 90 % in both teaching conditions, although it was higher for skills taught with FPF than for skills taught with EC. Thus, both procedures resulted in low rates of incorrect responding.

Discussion

Results of this study indicated that both flexible prompt fading (FPF) and error correction (EC) were effective in teaching four children diagnosed with ASD to expressively label pictures of Muppet characters. In terms of efficiency, across the four participants, results indicated that FPF was more efficient than EC in terms of the total number of teaching sessions, total number of teaching trials, and total amount of instructional time, although individual differences were seen. Furthermore, FPF resulted in fewer errors during teaching. Additionally, the EC condition resulted in better maintenance across the four participants; however, this could be a result of the extra teaching sessions, teaching trials, and teaching time. Anecdotally, Kenny’s incorrect responding during the final two full probe sessions was a result of giving silly answers (e.g., “Saxophone Zoot”) as opposed to the correct answer (i.e., “Zoot”). Thus, the results of this study showed that both prompting procedures can be effective in teaching children with ASD expressive labeling skills, and they further expand the research on both FPF and EC in several ways.

First, this study provides further empirical support that flexible prompt fading is an effective prompting system that can result in learning for children with ASD. Flexible prompt fading is a prompting system that has been implemented with numerous children diagnosed with ASD (e.g., Leaf et al. 2011b), yet there have been a limited number of studies that have empirically evaluated flexible prompt fading (e.g., Soluaga et al. 2008). Results of this study were similar to the previous studies in that flexible prompt fading was found to be an effective prompting method. Furthermore, the results of this study showed that FPF, which is based upon clinical judgment, can be replicated across different participants and across different research sites.

Fig. 5 Participant responding during teaching trials

Second, many of the prompting procedures implemented today require the therapist to adhere to a strict protocol. For example, in no-no prompting the teacher must always allow two independent trials before prompting the student on the third trial. In constant time delay, the teacher must wait a preset time before providing a prompt to the learner. In flexible prompt fading, however, there is no fixed formula that a therapist must follow; thus, he or she is able to make in-the-moment instructional decisions based on guidelines and parameters throughout teaching. Use of such clinical judgment during teaching allows the teacher to make real-time assessments of, and adjustments to, teaching procedures based on the behaviors displayed by the learner, which may lead to accelerated rates of learning.

Third, this study provides further empirical evidence that error correction procedures can be effective in teaching novel skills to children with ASD. Previous researchers have demonstrated that error correction procedures can be an effective methodology for teaching a wide variety of skills (e.g., Leaf et al. 2010; Smith et al. 2006; Worsdell et al. 2005). This study differed from some of the previous studies that implemented error correction procedures in that the participant was not required to respond to the instructional feedback, and only one remedial trial was provided rather than multiple trials (e.g., Rodgers and Iwata 1991; Worsdell et al. 2005). Yet, the results of this study still showed that EC procedures were highly effective in teaching new skills to children with autism. Despite the positive results of this study and of previous research, there is still a belief that EC may result in slower skill acquisition than near-errorless procedures and should not be used when first teaching a new skill to a student with ASD (e.g., Gast 2011). This study, and other recent studies (e.g., Leaf et al. 2010), have shown that error correction procedures do not necessarily result in aberrant behavior (anecdotally) or slow the rate of skill acquisition, can be implemented successfully when teaching a student a new skill, and in some cases may even result in students learning skills at a quicker rate.

Today, there are several prompting procedures implemented with children diagnosed with ASD. One of the goals for clinicians and researchers is to identify the most effective and efficient procedures. Thus, researchers have compared several prompting systems to determine the most efficacious prompting procedures. Results of most of these comparative studies have been mixed, both in terms of effectiveness and efficiency (e.g., Berkowitz 1990; Collier and Reid 1987; Leaf et al. 2010). In this study, both teaching procedures were nearly equally effective and efficient. Therefore, when teaching simple expressive labeling to high functioning students with ASD, it may not matter which prompting/error correction procedure is implemented, since both may result in quick skill acquisition. Clinicians may elect to use an FPF approach as opposed to EC when a student displays aberrant behaviors following incorrect responding, or when an error can lead to a string of errors; in such cases, an antecedent-based prompting strategy may be better suited.

Additionally, the FPF procedure requires a great deal of rapid decision making by the clinician, whereas the EC procedure allows the clinician to follow a stricter protocol. In this study, no specific training was provided to the instructors; however, all of the instructors had an extensive history of implementing discrete trial teaching and applied behavior analysis. Thus, the instructors had strong clinical skills, including: (a) the ability to make moment-to-moment analyses of a participant’s behaviors and responding; (b) a complete understanding of a variety of prompting systems and prompt types; (c) the effective use of reinforcement; and (d) an understanding of the functions of behavior.

Thus, it is important that clinicians, teachers, and parents are well trained (see Leaf et al. 2011a, b) and supervised prior to implementing FPF. If clinicians are not well trained or do not display the skills described above, an EC procedure may be more appropriate. Future research is warranted to explore the success of FPF when implemented by clinicians with more varied levels of experience and in different settings, in order to assess the generality of these findings.

Despite the positive results of this study, there are some limitations. First, we elected to use a parallel treatment design to compare the two prompting procedures. Ideally, when utilizing a parallel treatment design, it is desirable to obtain differences in rates of acquisition between the procedures being compared. In this study, however, the participants reached mastery criterion in a nearly equivalent number of sessions; thus, some experimental control was lost. One way to minimize this limitation was to place the parallel treatment design within a multiple probe design to show that acquisition of skills occurred only once the interventions were implemented. Nevertheless, some experimental control was lost due to the quick skill acquisition during both teaching conditions. Despite this being a limitation of the study, it is still important for clinicians to know that both procedures may be equally effective.

Second, there are several potentially significant measures that were not evaluated in this study, including: aberrant behavior, participant preference, and teacher preference. Although aberrant behavior was not measured, anecdotally there was little to no aberrant behavior throughout the study. Future researchers may wish to directly measure participants’ aberrant behavior; specifically, it may be interesting to measure whether the provision of corrective feedback leads to any aberrant behaviors. It may also be of value to determine if the procedures result in differing levels of generalized prompt dependency. Additionally, future researchers may wish to use concurrent chain designs (Hanley et al. 1997, 2005) to measure participants’ preference for the two instructional procedures.

A third limitation of the study is that three of the participants had a previous history with the FPF condition and had no history of EC. This previous history may lead to quicker skill acquisition for targets taught with FPF as opposed to targets taught with EC. However, Kenny had no previous history with FPF or EC and his results were similar to the three participants who had a previous history with FPF. Nevertheless, future researchers should be careful to minimize the previous history participants may have with various teaching procedures when comparing those procedures in empirical studies.

A fourth limitation of the study concerns the treatment fidelity measured for the FPF condition. In this study, the researchers scored whether the teachers displayed correct instructor behaviors (e.g., providing the correct instruction or consequence) and whether the teachers’ decisions resulted in the outcomes specified by the guidelines of the protocol (e.g., participant correct responding was maintained at 80 % or above). However, no measure was taken of whether the teachers used correct “clinical” judgment. Defining and evaluating correct “clinical” judgment may be difficult, as one teacher may elect to provide a prompt while another may elect not to. Future researchers may wish to further define good “clinical” judgment and develop empirical measures to evaluate the use of such judgment.

Limitations regarding the difficulty of obtaining procedural reliability data when treatments involve clinical judgment are not uncommon or unique to FPF. For example, Wolery and Gast (1984) highlight this issue in the context of graduated guidance and suggest that this relative “lack of adequate procedural integrity makes effectiveness studies difficult to evaluate” (p. 59). Other behavior change strategies, such as shaping, also rely on responsive clinical decision making. Despite these caveats, shaping and graduated guidance are clinical practices with great utility and evidence for their effectiveness. However, the question of how the current study and procedures can be systematically replicated across other learners and instructors is an empirical one.

A fifth limitation is that during EC, the instructional feedback and the subsequent remedial trial were temporally close, which makes the procedure similar to FPF. Future researchers may wish to increase the time from the provision of instructional feedback to the start of the next teaching trial to see if this results in different behavioral change.

Finally, this study only demonstrated the effectiveness of FPF and EC when implemented in a one-to-one setting, for a limited number of participants, all of whom could be considered higher functioning, and for teaching relatively simple skills. Future researchers should extend these findings by evaluating these procedures with more impacted children (e.g., lower IQ scores or higher rates of aberrant behavior), within small and large group instructional formats, and for more difficult skills. In such a manner it may be possible to evaluate the relative effectiveness and efficiency of the procedures with participants of varying characteristics and needs. In addition, in order to determine the most effective and efficient teaching procedures, FPF and EC should also be compared to other commonly implemented prompting systems (e.g., most-to-least, least-to-most, constant time delay).

References

Berkowitz, S. (1990). A comparison of two methods of prompting in training discrimination of communication book pictures by autistic students. Journal of Autism and Developmental Disorders, 20, 255–262. doi:10.1007/BF02284722.

Bloh, C. (2008). Assessing transfer of stimulus control procedures across learners with autism. The Analysis of Verbal Behavior, 24, 87–101. Retrieved from http://www.abainternational.org/TAVB.asp.

Bozkurt, F., & Gursel, O. (2005). Effectiveness of constant time delay on teaching snack and drink preparation skills to children with mental retardation. Education and Training in Developmental Disabilities, 40, 390–400. Retrieved from http://www.daddcec.org/Publications/ETADDJournal.aspx.

Charlop, M. H., & Trasowech, J. E. (1991). Increasing autistic children’s daily spontaneous speech. Journal of Applied Behavior Analysis, 24, 747–761. doi:10.1901/jaba.1991.24-747.

Collier, D., & Reid, G. (1987). A comparison of two models designed to teach autistic children a motor task. Adapted Physical Activity Quarterly, 4, 228–236. Retrieved from http://www.journals.humankinetics.com/apaq.

Ferster, C. B., & DeMeyer, M. K. (1962). A method for the experimental analysis of behavior of autistic children. The American Journal of Orthopsychiatry, 32, 89–98. doi:10.1111/j.1939-0025.1962.tb.00267.x.

Gast, D. L. (2011). An experimental approach for selecting a response-prompting strategy for children with developmental disabilities. Evidence-Based Communication Assessment and Intervention, 5, 149–155. doi:10.1080/17489539.2011.637358.

Gast, D. L., & Wolery, M. (1988). Parallel treatments design: a nested single subject design for comparing instructional procedures. Education and Treatment of Children, 11, 270–285. Retrieved from http://www.educationandtreatmentofchildren.net/

Hanley, G. P., Piazza, C. C., Fisher, W. W., Contrucci, S. A., & Maglieri, K. A. (1997). Evaluation of client preference for function-based treatment packages. Journal of Applied Behavior Analysis, 30, 459–473. doi:10.1901/jaba.1997.30-459.

Hanley, G. P., Piazza, C. C., Fisher, W. W., & Maglieri, K. A. (2005). On the effectiveness of and preference for punishment and extinction components of function-based interventions. Journal of Applied Behavior Analysis, 38, 51–65. doi:10.1901/jaba.2005.6-04.

Leaf, R. B., & McEachin, J. J. (1999). A work in progress: Behavior management strategies and a curriculum for intensive behavioral treatment of autism. New York: Different Roads to Learning. Retrieved from http://www.difflearn.com/.

Leaf, J. B., Sheldon, J. B., & Sherman, J. A. (2010). Comparison of simultaneous prompting and no-no prompting in two choice discrimination learning with children with autism. Journal of Applied Behavior Analysis, 43, 215–228. doi:10.1901/jaba.2010.43-215.

Leaf, J. B., Oppenheim, M., Dotson, W., Johnson, V. A., Courtemanche, A. B., Sherman, J. A., et al. (2011a). Effects of no-no prompting on teaching expressive labeling of facial expressions to children with and without a pervasive developmental disorder. Education and Training in Developmental Disabilities, 46, 186–203. Retrieved from http://www.daddcec.org/Publications/ETADDJournal.aspx.

Leaf, R. B., Taubman, M., McEachin, J. J., Leaf, J. B., & Tsuji, K. H. (2011b). A program description of a community-based intensive behavioral intervention program for individuals with autism spectrum disorders. Education and Treatment of Children, 34, 259–285. doi:10.1353/etc.2011.0012.

Lovaas, O. I. (1987). Behavioral treatment and normal educational and intellectual functioning in young autistic children. Journal of Clinical and Consulting Psychology, 55, 3–9. doi:10.1037/0022-006x.55.1.3.

MacDuff, G. S., Krantz, P. J., & McClannahan, L. E. (1993). Teaching children with autism to use photographic activity schedules: maintenance and generalization of complex response chains. Journal of Applied Behavior Analysis, 26, 89–97.

Morse, T. E., & Schuster, J. W. (2000). Teaching elementary students with moderate intellectual disabilities how to shop for groceries. Exceptional Children, 66, 273–288. Retrieved from http://www.cec.sped.org/exceptionalchildren/.

Rodgers, T. A., & Iwata, B. A. (1991). An analysis of error-correction procedures during discrimination training. Journal of Applied Behavior Analysis, 24, 775–781. doi:10.1901/jaba.1991.24-775.

Schreibman, L. (1975). Effects of within-stimulus and extra-stimulus prompting on discrimination learning in autistic children. Journal of Applied Behavior Analysis, 8, 91–112. doi:10.1901/jaba.1975.8-91.

Schumaker, J., & Sherman, J. A. (1970). Training generative verb usage by imitation and reinforcement procedures. Journal of Applied Behavior Analysis, 3, 273–287. doi:10.1901/jaba.1970.3-273.

Smith, T. (2001). Discrete trial training in the treatment of autism. Focus on Autism and Other Developmental Disabilities, 16, 86–92. doi:10.1177/108835760101600204.

Smith, T., Mruzek, D. W., Wheat, L. A., & Hughes, C. (2006). Error correction in discrimination training for children with autism. Behavioral Interventions, 21, 245–263. doi:10.1002/bin.223.

Soluaga, D., Leaf, J. B., Taubman, M., McEachin, J., & Leaf, R. B. (2008). A comparison of flexible prompt fading and constant time delay for five children with autism. Research in Autism Spectrum Disorders, 2, 753–765. doi:10.1016/j.rasd.2008.03.005.

Tarbox, R. S., Wallace, M. D., Penrod, B., & Tarbox, J. (2007). Effects of three-step prompting on compliance with caregiver requests. Journal of Applied Behavior Analysis, 40, 703–706. doi:10.1901/jaba.2007.703-706.

Wolery, M., & Gast, D. L. (1984). Effective and efficient procedures for the transfer of stimulus control. Topics in Early Childhood Special Education, 4(3), 52–77.

Worsdell, A. S., Iwata, B. A., Dozier, C. L., Johnson, A. D., Neidert, P. L., & Thomason, J. L. (2005). Analysis of response repetition as an error-correction strategy during sight-word reading. Journal of Applied Behavior Analysis, 38, 511–527. doi:10.1901/jaba.2005-115-04.

An Evaluation of Positional Prompts for Teaching Receptive Identification to Individuals Diagnosed with Autism Spectrum Disorder


Justin B. Leaf & Joseph H. Cihon & Donna Townley-Cochran & Kevin Miller & Ronald Leaf & John McEachin & Mitchell Taubman
Published online: 22 September 2016
© Association for Behavior Analysis International 2016

Abstract

In this study, we evaluated the effects of positional prompts on teaching receptive identification to six children diagnosed with autism spectrum disorder (ASD). The researchers implemented a most-to-least prompting system using a three level hierarchy to teach receptive picture identification. Within the prompting hierarchy, only positional prompts were used. The most assistive prompt was placing the target stimulus 12 in. closer to the participant, the less assistive prompt was placing the target stimulus 6 in. closer to the participant, and no prompt was placing the target stimulus in line with the alternative stimuli. A non-concurrent multiple baseline design across behaviors was used to evaluate the effectiveness of the positional prompt. Results indicated that the implementation of positional prompts resulted in participants reaching mastery criterion and maintaining skills at follow-up for the majority of the participants. The results of the study have both future clinical and research implications.

Keywords: Autism . Most-to-least prompting . Prompts . Positional prompts . Transfer of stimulus control

Justin B. Leaf
Jblautpar@aol.com

Joseph H. Cihon
Jcihonautpar@aol.com

Donna Townley-Cochran
dcochranautpar@aol.com

Kevin Miller
k_millerautpar@aol.com

Ronald Leaf
Rlautpar@aol.com

John McEachin
jmautpar@aol.com

Mitchell Taubman
Mtautpar@aol.com

Autism Partnership Foundation, 200 Marina Drive, Seal Beach, CA 90740, USA

Prompts are often used throughout the course of instruction provided for individuals diagnosed with an autism spectrum disorder (ASD) (Green, 2001; Leaf et al., 2014a; MacDuff, Krantz, & McClannahan, 2001). Effective prompts are antecedent manipulations that alter the stimulus conditions in a manner that increases the likelihood of the desired response (Green, 2001; Grow & LeBlanc, 2013; MacDuff et al., 2001; Krantz & McClannahan, 1998; Wolery, Ault, & Doyle, 1992a). For instance, in the case of teaching receptive labels, the teacher may physically guide the learner’s hand to the correct stimulus in an array after delivering an instruction to “touch apple.” Prompts should be faded in a way that gradually shifts stimulus control from the auxiliary, extra, or artificial stimulus (MacDuff et al., 2001) to the stimulus that should occasion the learner’s response in the criterion environment.

Today, there are numerous prompt types utilized by individuals providing treatment for children diagnosed with ASD (Green, 2001; MacDuff et al., 2001). These include, but are not limited to, the following: verbal, modeling, manual, gestural, photographic and line drawing, and textual (see MacDuff et al., 2001 for a review). It can be difficult to determine when to provide a prompt, when to fade a prompt, and what level of assistance to provide. Therefore, several prompt fading systems have been developed, which include, but are not limited to, constant time delay (e.g., Walker, 2008), progressive time delay (e.g., Walker, 2008), simultaneous prompting (e.g., Leaf, Sheldon, & Sherman, 2010), no-no prompting (e.g., Leaf et al., 2010), flexible prompt fading (e.g., Soluaga, Leaf, Taubman, McEachin, & Leaf, 2008), and least-to-most prompting (e.g., Libby, Weiss, Bancroft, & Ahearn, 2008). Another common prompt fading system is most-to-least prompting (e.g., Libby et al., 2008). In most-to-least prompting, the therapist begins by providing the most assistive prompt, usually a controlling prompt that should guarantee a correct response (Wolery, Holcombe, Werts, & Cipolloni, 1992), and systematically fades to a less assistive prompt (e.g., a non-controlling prompt, which increases the likelihood of a correct response). Over time, the goal is to transfer stimulus control from the most assistive prompt to the desired controlling stimulus while limiting errors.

When using only one or a variety of prompt types, there is often a risk of prompt dependence or failure to transfer stimulus control to the desired stimuli. When this occurs, a learner might not engage in an approximation of the terminal response without the provision of a prompt or fail to respond correctly when prompts are completely faded. As a result of these two common concerns, some authors have recommended against the use of particular prompt types altogether. For example, Grow and LeBlanc (2013) recommend against the use of extra stimulus prompts, such as position prompts, to avoid establishing faulty stimulus control during teaching.

To use position prompts, the target item is moved closer to the student while the other item(s) in the array remain farther away, which increases the likelihood of selecting the correct, closer item (Lovaas, 2003). For example, in a match-to-sample task using three stimuli, the teacher may place the target stimulus closer to the participant than the other stimuli and gradually move the target back to the original field. Transfer of stimulus control is demonstrated when the learner responds correctly with all of the stimuli equidistant from the learner. Despite recommendations against their use, position prompts have been commonly implemented in clinical settings, recommended in curricular books (e.g., Lovaas, 1981, 2003), and evaluated in empirical investigations when combined with other prompt types (e.g., Leaf et al., 2014a; Soluaga et al., 2008).

Soluaga et al. (2008) compared the use of a time delay prompting procedure to flexible prompt fading to teach receptive identification to five children diagnosed with ASD. The authors evaluated five different prompts (i.e., physical, gestural, 2D, reduction of the field, and positional) to determine what controlling prompt to use with the time delay prompting procedure. Additionally, positional prompts were used as part of teaching in the flexible prompt fading condition. The results of the study indicated that both prompting procedures were effective, and the efficiency was idiosyncratic to the learner. Although the results showed the effectiveness of the two prompting procedures, there were no data on how frequently positional prompts were implemented. Furthermore, data were not reported on the accuracy of the participants’ responses when positional prompts were used.

In 2014, Leaf and colleagues compared most-to-least prompting to error correction for two young children diagnosed with ASD. A four-step prompt hierarchy was used in the most-to-least procedure for both participants. A positional prompt was the second most assistive prompt used for one of the participants and the least assistive prompt for the other participant. The results of the study showed that both of the procedures were effective in teaching the participants the receptive tasks. The authors evaluated participant responding during teaching but did not specifically evaluate the rate of correct responding when a positional prompt was provided. Similarly, Leaf, Leaf, Taubman, McEachin, and Delmolino (2014b) compared flexible prompt fading to error correction to teach five children diagnosed with ASD to vocally label pictures of Muppet© characters. Once again, positional prompts were used as part of the flexible prompt fading procedures. Results of the study showed that across two different sites, participants learned the skills taught using the flexible prompt fading procedure. However, like the two previous studies, there were no data indicating participant responding when a positional prompt was presented.

While the use of positional prompts in combination with other prompt types (e.g., flexible prompt fading) has been shown to be an effective teaching tool, it is unclear if position prompts alone would yield similar results. Additionally, professional recommendations against the use of positional prompts (e.g., Grow & LeBlanc, 2013) may result in professionals avoiding the use of potentially effective prompt types. Therefore, research evaluating the use of positional prompts in the absence of other prompt types is warranted to determine if positional prompts can be effective and to evaluate best practice recommendations (e.g., Grow & LeBlanc, 2013). The purpose of the present study was to evaluate the effects of positional prompts implemented in a most-to-least prompting system, in the absence of other prompt types, on receptive label acquisition with six children diagnosed with ASD.

Method

Participants

Six children, all independently diagnosed with ASD, participated in this study. All participants had a previous history with discrete trial teaching (DTT). Each participant had a learning history with flexible prompt fading (Soluaga et al., 2008) in which multiple prompt types were used (e.g., vocal prompts, reduction of the field prompts, physical prompts, model prompts, and multiple alternatives). However, each participant had minimal experience with positional prompts prior to this study. At the time of the study, all participants were receiving behavioral intervention which included programming for receptive labels. Table 1 provides the Peabody Picture Vocabulary Standard Score and Expressive One Word Standard Score for each of the six participants.

Michael was a 9-year-old male who was placed in a general education classroom with behavioral supports. Michael could expressively label over 1000 items but rarely displayed spontaneous language. He could sustain attending for approximately 5 min prior to the start of the study. Michael displayed high rates of stereotypic behavior, including hand flapping, rocking, and making vocal noises. Michael had no previous history with positional prompting but had 6 years of experience with DTT.

Dwight was a 7-year-old male who was placed in a special education classroom. He could expressively label over 1000 words, displayed spontaneous language, could attend for up to 15 min in duration, and displayed moderate rates of stereotypic behavior. Dwight had no previous history with positional prompting but had 5 years of experience with DTT.

Andy was a 4-year-old male who was placed in a regular education classroom with supports. He communicated using full sentences, had age typical play skills, and displayed low rates of stereotypic behavior but frequently engaged in non-compliant behavior. Andy had minimal experience with positional prompts and had 1 year of experience with DTT.

Pam was a 4-year-old female who was receiving applied behavior analysis (ABA) intervention as her primary form of education. She communicated using full sentences, had age typical play skills, and displayed low rates of stereotypic behavior but frequently engaged in non-compliant behavior and tantrums. Pam had minimal experience with positional prompts and had 1 year of experience with DTT.

Jim was a 4-year-old male who was receiving ABA-based intervention as his primary form of education. He communicated using full sentences, had limited play skills, and displayed low rates of stereotypic behavior but frequently engaged in non-compliant behavior. Jim had no previous experience with positional prompts and had 6 months of experience with DTT.

Angela was a 4-year-old female who was receiving ABA-based intervention as her primary form of education. She was fully conversational, had limited play skills, and displayed low rates of stereotypic behavior but frequently engaged in non-compliant behavior. Angela had minimal experience with positional prompts and had 6 months of experience with DTT.

Setting

All sessions occurred in a small research room at a private clinic located in Southern California that provided comprehensive behavioral intervention for individuals diagnosed with ASD. The room contained a table and chairs as well as other furniture and educational materials (e.g., books). The table was marked with a minimally visible grid for use by the interventionists during the prompting condition so that the position of the stimuli remained consistent across days, trials, and interventionists. Sessions occurred once a day up to 5 days per week and lasted approximately 15 min.

Table 1: Participant demographic information

Participant Expressive One Word Standard Score Peabody Picture Vocabulary Standard Score
Michael 90 85
Dwight 97 109
Andy 145 120
Pam 145 132
Jim 106 90
Angela 84 81

Targets

Prior to baseline, the researchers met with the participants’ parents or clinical supervisor (i.e., the person in charge of developing programming) to determine the targets for the study. The researchers, parents, and/or clinical supervisors selected targets in which the participant’s peers showed interest so that the participant could participate in conversations about the targets and/or play appropriately with the targets. As such, the researchers selected unknown pictures of either cartoon characters, comic book characters, sports teams, or athletes for use during the intervention (see Table 2). Stimuli were introduced in sets of three with each set consisting of three different stimuli (i.e., nine pictures in total), with the exception of Angela with whom only two sets were used. None of the targets were available for teaching outside of research sessions.

Table 2: Skills taught to each participant

Participant Set one Set two Set three
Michael Ohio Statea, Notre Dameb, Florida Statec USCa, Oregonb, Alabamac TCUa, Michigan Stateb, Baylorc
Dwight Oregona, USCb, Alabamac Dwight Howarda, Carmelo Anthonyb, Tim Duncanc Mike Trouta, Peyton Manningb, Kobe Bryantc
Andy Colossusa, Sabertoothb, Sentinelc Galactusa, Apocolypseb, Thanosc Modoka, Task Masterb, Baron Zemoc
Pam Magnetoa, Professor Xb, Wolverinec Modoka, Baron Zemob, Task Masterc Mirror Mastera, Baneb, Sinestoc
Jim Raptora, Trexb, Stegosaursc Galactusa, Thanosb, Apocolypsec Mirror Mastera, Baneb, Sinestoc
Angela Raptora, Trexb, Stegosaursc Glimmera, Terenceb, Meridac

a Represents target 1 for each set
b Represents target 2 for each set
c Represents target 3 for each set

Behavior Coding

The researchers implemented conventional DTT within both probe sessions (see below) and teaching sessions (see below). A response was defined as the first stimulus the participant touched after the instruction. The participants’ hands were not allowed to be in contact with any of the stimuli at the onset of the trial. On each trial during probe sessions, the participants’ responses were categorized as correct, incorrect, or no response. A correct response was defined as anytime the participant touched the target stimulus within 5 s of the instruction. An incorrect response was defined as anytime the participant touched a stimulus that did not correspond with the interventionist’s instruction, touched two stimuli simultaneously, or stated she/he did not know the correct stimulus (e.g., “I don’t know”). No response was defined as anytime the participant did not touch any stimulus within 5 s of the interventionist’s instruction.

On each trial during teaching, participant responding was also categorized as correct, incorrect, or no response as defined above. In addition to these measures, prompted correct and incorrect responses were measured during teaching sessions. A prompted correct response was defined as anytime the participant touched the target stimulus within 5 s of the instruction when the target stimulus was 0 or 6 in. from the participant. A prompted incorrect response was defined as anytime the participant touched any stimulus that did not correspond with the interventionist’s instruction, touched two stimuli simultaneously, touched none of the stimuli, or stated she/he did not know the correct stimulus (e.g., “I don’t know”) when the target stimulus was placed 0 or 6 in. away from the participant.

The data sheets used for scoring trials during daily probe and teaching sessions were designed based on Grow and LeBlanc’s (2013) best practice recommendations for receptive language instruction (see Fig. 1 in Grow and Leblanc for a detailed example). The data sheets were designed in an effort to ensure counterbalancing of the three stimuli within each comparison array and the correct stimuli in a semi-randomized fashion.

Fig. 1 Closed circles indicate the percentage of trials with correct responses during daily probe sessions; open squares and closed triangles indicate the percentage of trials with independent and prompted correct responses during teaching sessions, respectively

Dependent Measures

The primary dependent variable was the percentage of trials with correct responding during daily probe sessions (described below). The researchers divided the total number of correct trials by the total number of trials and multiplied by 100 to determine the percentage of correct responses per session. Daily probe sessions were used to determine baseline levels, mastery criterion, and maintenance. The mastery criterion was defined as 100 % correct on all targets within a set across three consecutive daily probe sessions.

The second measure evaluated within this study was the total percentage of correct and prompted correct responses during teaching sessions. To calculate the percentage of correct responses, the researchers divided the total number of correct responses by the total number of trials and multiplied by 100. To calculate the percentage of prompted correct responses, the researchers divided the total number of prompted correct responses by the total number of trials and multiplied by 100. The percentage of prompted correct responses was calculated for each level of prompt (i.e., 0, 6, and 12 in.). We also evaluated the total percentage of correct responding across all prompt levels and participants.
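These percentage measures all reduce to the same count-over-total formula. The sketch below is a hypothetical illustration of that calculation; the function name and trial-record format are assumptions for this example, not the authors' code.

```python
# Hypothetical helper mirroring the percentage calculations described above.
# A trial record is simply a response label such as "correct" or "prompted_correct".
def percent_of_trials(responses, category):
    """Return (count of `category` responses / total trials) * 100."""
    if not responses:
        return 0.0
    return 100 * sum(r == category for r in responses) / len(responses)
```

Applying this separately to the trials at each prompt level (0, 6, and 12 in.) would yield the per-level percentages of the kind reported in Table 3.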

The third measure was the overall percentage of correct responding across the three prompting levels (see below) for each set and the overall percentage correct across all teaching sessions. The final measure of the study was a trial-by-trial analysis of the prompt level provided for each target, for each participant, across all sets. The researchers analyzed each trial during teaching sessions, and what prompt level was provided.

Daily Probe Sessions

The interventionists implemented daily probe sessions during baseline, intervention, and maintenance. Daily probe sessions consisted of nine total trials; three for each stimulus. The comparison array was counterbalanced across trials so that the correct comparison was present in each location (i.e., right, center, left) an equal number of times. To begin a trial, the interventionist presented the comparison array 12 in. from the edge of the table where the participant was seated. The interventionist then delivered an instruction to select one of the stimuli (e.g., “Touch Sabertooth”). The interventionist provided 5 s for the participant to respond. If the participant did not respond within 5 s, the interventionist instructed the participant to make a selection (e.g., “You need to try”). Following a response, regardless of accuracy, the interventionist responded with a neutral statement (e.g., “Thanks” or “Thank you”). No programmed reinforcement was delivered for correct, incorrect, or no responses; however, praise was delivered for general compliance (e.g., sitting at the table, touching rather than picking up the correct stimulus) and not engaging in any other aberrant behavior.
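The counterbalancing of the nine-trial probe can be sketched as follows; this is a minimal illustration under assumed names, not the authors' actual randomization procedure.

```python
import random

def probe_trial_order(stimuli, rng=random):
    """Build a nine-trial daily probe: each stimulus is the target three times,
    appearing once as the correct comparison in each of the three locations."""
    assert len(stimuli) == 3
    trials = [(target, position)
              for target in stimuli
              for position in ("left", "center", "right")]
    rng.shuffle(trials)  # randomize trial order while keeping the counts balanced
    return trials
```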

Baseline

Baseline consisted of one daily probe (described above) per session. After the daily probe, the interventionist returned the participant back to his or her regularly planned treatment session.

Intervention

No daily probe session was conducted on the first day of intervention for each set. All intervention sessions following the first day of intervention started with a daily probe, followed by a short break before teaching. Trials during teaching sessions were similar to trials during daily probe sessions in that the comparison array and the item targeted were counterbalanced across trials. Each stimulus within the set was the target for six trials with a total of 18 trials during each intervention session.

Positional prompts served as the only prompt type during intervention. To help ensure procedural fidelity, the interventionists lightly marked the table to identify the location of the stimuli and where the positional prompt should be provided (the same marked table was used during the baseline and maintenance conditions). The marks ensured that the position of the stimuli remained consistent across days, trials, and interventionists. At each level, three marks identified the target in the left, center, and right positions. The three levels were marked X, Y, and Z. The X position was placed on the edge of the table (0 in. away from the participant) closest to where participants were sitting and represented the most assistive prompt provided within teaching sessions. The Y position was placed 6 in. from the edge of the table. The Z position was where the stimuli were placed on each trial during daily probe sessions (i.e., 12 in. away from the edge of the table where the participant was seated). When the target stimuli were 12 in. from the participant (i.e., the Z position), no prompt was provided.

The interventionist used a most-to-least prompting procedure (MacDuff et al., 2001) during all teaching sessions. The most-to-least prompting procedure consisted of moving the target stimulus closer or farther away from the participant. The interventionists used a three-level prompting hierarchy: when the target was 0 in. from the participant, it was considered the most assistive prompt; when the target was 6 in. from the participant, it was considered the second most assistive prompt; and when the target was 12 in. from the participant, there was no prompt provided. Across all three levels, only the target stimulus was placed closer (i.e., 0 or 6 in.) while the other two targets were always placed 12 in. from the participant.

A prompting hierarchy was applied to each stimulus such that each stimulus could move up and down the hierarchy regardless of performance with the other stimuli. The criterion to move to a less assistive prompt was the participant engaging in two correct prompted responses in a row. For example, if the participant selected the correct stimulus on two consecutive trials with the stimulus 0 in. from the participant, that stimulus would be presented 6 in. from the participant on the following trial with that stimulus. The criterion to move to a more assistive prompt was the participant engaging in a single incorrect or prompted incorrect response. For example, if the participant selected an incorrect stimulus when that stimulus was 12 in. from the participant (i.e., no prompt), the next trial that stimulus would be placed 6 in. from the participant. On the first teaching session, the target stimulus always started 0 in. from the participant (i.e., most-to-least prompting).
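The movement criteria above amount to a simple per-stimulus rule: two consecutive correct responses fade the prompt one level, and any incorrect response reinstates a more assistive level. A minimal sketch follows; the class and names are hypothetical illustrations, not materials from the study.

```python
PROMPT_LEVELS = [0, 6, 12]  # inches from the participant; 12 in. = no prompt

class PromptTracker:
    """Tracks the most-to-least prompt level for a single stimulus."""

    def __init__(self):
        self.index = 0    # start at the most assistive prompt (0 in.)
        self.streak = 0   # consecutive correct responses at the current level

    @property
    def level(self):
        return PROMPT_LEVELS[self.index]

    def record(self, correct):
        if correct:
            self.streak += 1
            if self.streak >= 2 and self.index < len(PROMPT_LEVELS) - 1:
                self.index += 1   # fade to a less assistive prompt
                self.streak = 0
        else:
            self.streak = 0
            if self.index > 0:
                self.index -= 1   # return to a more assistive prompt
```

Because each stimulus gets its own tracker, one stimulus can move up or down the hierarchy independently of the others, as described above.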

To begin a trial, the interventionist presented the stimulus array with the correct stimulus located in the predetermined position both horizontally (i.e., left, middle, or right) and vertically (i.e., 0, 6, or 12 in. away from the participant). Then the interventionist delivered an instruction to select the targeted stimulus for that trial (e.g., “Where is Thanos?”). If the participant selected the correct stimulus from the array, the interventionist provided praise (e.g., “That’s right!”) and moved on to the next trial. Praise was selected because it had been demonstrated to be an effective reinforcer during clinical sessions. Moreover, conditioning social praise as a reinforcer was part of all of the participants’ daily programming (Leaf et al., 2016). The names of the stimuli were not provided during feedback. If the participant selected the incorrect stimulus on any trial, the interventionist provided informational feedback (e.g., “That’s not it.”) and moved on to the next trial with a more assistive prompt.

Intervention Prime

After seven sessions with the first set, Jim and Angela were still responding at levels similar to baseline (i.e., near chance levels). The researchers hypothesized that praise may not have been functioning as a reinforcer, so a token reinforcement system was implemented. A token paired with praise was delivered contingent on a correct response. The completed token board was then exchanged for access to a treasure chest that contained various toys (e.g., swords, cars, putty).

Maintenance

Maintenance was identical to baseline (see above). Maintenance data were taken an average of 15 days (range, 2 to 23), 4 days (range, 2 to 6), and 4 days (range, 2 to 7) after teaching had concluded for Michael on sets 1, 2, and 3, respectively. Maintenance data were taken an average of 7 days (range, 3 to 19), 10 days (range, 7 to 14), and 9 days (range, 5 to 14) after teaching had concluded for Dwight; an average of 9 days (range, 2 to 15), 8 days (range, 2 to 13), and 6 days (range, 4 to 7) for Andy; and an average of 6 days (range, 4 to 9), 24 days (range, 14 to 29), and 7 days (range, 3 to 10) for Pam on sets 1, 2, and 3, respectively.

Because Jim did not reach mastery criterion on set 1, maintenance data were not taken. Maintenance data were taken an average of 15 days (range, 13 to 17) and 12 days (range, 9 to 14) after teaching had concluded for Jim for sets 2 and 3, respectively. Because Angela never reached mastery criterion on set 1 or set 2, maintenance data were not collected.

Experimental Design

A non-concurrent multiple baseline across stimuli design (Harvey, May, & Kennedy, 2004; Watson & Workman, 1981), replicated across participants, was used to evaluate the effects of positional prompts on the acquisition of the different targets with each of the participants. Sessions lasted up to 15 min and occurred 2 to 5 days a week depending upon participant availability (e.g., whether the participant had a session in the clinical office).

Interobserver Agreement and Treatment Fidelity

The interventionist recorded responding on each trial during each daily probe session and teaching session. A second observer independently recorded responding on each trial during 36.2 % of daily probe sessions (range, 33 to 40.9 % across participants) and 36.4 % of teaching sessions (range, 28.5 to 36.8 % across participants). Agreement was defined as both observers marking the same response occurring on the same trial. Interobserver agreement was calculated by dividing the total number of agreements by the total number of agreements plus disagreements and multiplying by 100. Percentage agreement across probes was 99.6 % (range, 97.7 to 100 % across participants) and 100 % for teaching sessions.
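The agreement calculation is a trial-by-trial comparison of the two observers' records. A hedged sketch of the formula (function and argument names are assumptions for illustration):

```python
def interobserver_agreement(observer_a, observer_b):
    """Percentage of trials on which two observers scored the same response."""
    assert len(observer_a) == len(observer_b) and observer_a
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100 * agreements / len(observer_a)
```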

Treatment fidelity was assessed to ensure the interventionist implemented daily probe sessions correctly across baseline, intervention, and maintenance conditions. An independent observer recorded the interventionists’ implementation of daily probe sessions in 31.1 % of sessions across the baseline, intervention, and maintenance conditions. Correct interventionist behaviors were (1) placing the comparison array in the correct positions as indicated by the data sheet, (2) providing the correct instruction, (3) providing 5 s for the participant to respond, and (4) providing neutral feedback regardless of accuracy. The interventionists implemented trials during daily probe sessions correctly during 99.8 % of sessions (range, 99.3 to 100 % across participants).

Treatment fidelity was also assessed to ensure the interventionist implemented the trials during teaching sessions correctly during the intervention condition. An independent observer recorded the interventionists’ implementation of the positional prompting hierarchy during 31.4 % of teaching sessions. Correct interventionist behaviors were (1) placing the comparison array in the correct positions as indicated by the data sheet, (2) placing the correct comparison in the correct position (i.e., X, Y, or Z) on the table, (3) providing 5 s for the participant to respond, (4) providing praise, or a token during sessions with Jim and Angela, following a correct response, and (5) providing corrective feedback following an incorrect response. The interventionists implemented the positional prompting procedure correctly during 99.6 % of trials (range, 97.2 to 100 % across participants).

Results

Performance on Daily Probe Sessions

Figures 1, 2, 3, 4, 5, and 6 display the percentage of correct trials during daily probe sessions for each of the six participants. These data are depicted by closed circles. Michael reached the mastery criterion across all three sets of stimuli. During the baseline condition, Michael responded correctly on a low percentage of trials during probe sessions, near chance levels. Michael reached mastery criterion in 3, 7, and 3 daily probe sessions for the first, second, and third set of stimuli, respectively. Michael responded correctly on all trials (i.e., 100 % of trials) during the maintenance condition across all three sets of stimuli.

Dwight also reached the mastery criterion on all three sets of stimuli. During the baseline condition, Dwight responded correctly on a low percentage of trials during probe sessions, near chance levels. Dwight reached mastery criterion in 6, 4, and 4 daily probe sessions for the first, second, and third set of stimuli, respectively. Dwight responded correctly on 100 % of trials on all but one daily probe session during the assessment of maintenance.

Andy reached mastery criterion on all three sets of stimuli. During the baseline condition, Andy responded correctly on a low percentage of trials during probe sessions. Although there was an increase in correct responding during baseline with set 3, performance was still at chance levels. Andy reached mastery criterion in 3, 4, and 4 daily probe sessions for the first, second, and third set of stimuli, respectively. Andy responded correctly on 100 % of trials on all but one daily probe session during the assessment of maintenance.

Pam reached mastery criterion on all three sets of stimuli. During the baseline condition, Pam responded correctly on a low percentage of trials during probe sessions, near chance levels. Pam reached mastery criterion in 4, 6, and 4 daily probe sessions for the first, second, and third set of stimuli, respectively. Pam responded correctly on 100 % of trials on all but two daily probe sessions during the assessment of maintenance.

Jim had varied levels of correct responding across the three sets. For the first set, Jim responded correctly on a low percentage of trials during the baseline condition. After seven sessions of intervention, no improvement was observed, and a token reinforcement system was implemented (i.e., intervention-prime). After five sessions within this condition, Jim continued to show no improvement, and for ethical reasons (e.g., not prolonging unsuccessful intervention), intervention was discontinued. For the second set, Jim displayed variable levels of correct responding during the baseline condition. Jim reached mastery criterion on the 13th session in the intervention-prime condition. Jim continued to respond correctly on 100 % of trials during all sessions in the maintenance condition. For the third set of stimuli, Jim again displayed variable levels of correct responding in the baseline condition. During the intervention-prime condition, Jim did not reach mastery criterion; however, he did respond correctly on a high percentage of trials. During the assessment of maintenance, he responded incorrectly on one trial during each session.

Angela did not reach the mastery criterion with either set of stimuli. No improvement in correct responding was observed with the first set of stimuli in either the intervention or intervention-prime condition, and due to ethical considerations (e.g., not prolonging unsuccessful intervention), intervention was discontinued. However, to ensure the results were not idiosyncratic to the first set of stimuli, a second set of stimuli was introduced. After seven sessions of intervention-prime, no improvement in correct responding was observed and a third set was not introduced.

Fig. 2 Closed circles indicate the percentage of trials with correct responses during daily probe sessions; open squares and closed triangles indicate the percentage of trials with independent and prompted correct responses during teaching sessions, respectively

Fig. 3 Closed circles indicate the percentage of trials with correct responses during daily probe sessions; open squares and closed triangles indicate the percentage of trials with independent and prompted correct responses during teaching sessions, respectively

Responding During Intervention

Figures 1, 2, 3, 4, 5, and 6 also report the percentage of trials during teaching sessions in which each participant responded independently correct (open squares) and correctly with the provision of a prompt (closed triangles). For Michael, Dwight, Andy, and Pam, two trends emerged. First, there was an inverse relationship in which correct independent responding increased as prompted correct responses decreased. Second, correct independent responding increased quickly following the first session of intervention.

Different patterns of responding were observed for Jim and Angela. For Jim, low levels of correct responding with or without a prompt throughout intervention were observed with set 1. However, there was an increase in correct independent responding as intervention continued with sets 2 and 3. For Angela, correct responding was typically only observed when a prompt was used during intervention with set 1. With the introduction of set 2, however, Angela responded correctly without prompting on a high percentage of trials throughout intervention-prime.

Fig. 4 Closed circles indicate the percentage of trials with correct responses during daily probe sessions; open squares and closed triangles indicate the percentage of trials with independent and prompted correct responses during teaching sessions, respectively

Fig. 5 Closed circles indicate the percentage of trials with correct responses during daily probe sessions; open squares and closed triangles indicate the percentage of trials with independent and prompted correct responses during teaching sessions, respectively

Table 3 displays the overall percentage of correct responding during teaching and the percentage of correct responding when the target was placed 0, 6, and 12 in. from the participant. Michael, Dwight, Andy, and Pam all responded correctly on a high percentage of trials across all prompt levels (i.e., above 95, 98, 96, and 95 % across all three sets, respectively). Pam also responded incorrectly on a higher percentage of trials when the most assistive prompt was provided. Jim responded correctly on around 75 % of trials across all teaching sessions. Jim also responded incorrectly on a high percentage of trials with set 1 when the most assistive prompt was provided. Angela responded correctly on around 78 % of trials across all teaching sessions; however, correct responding typically only occurred when a prompt was provided.

Table 3 Participant percentage of correct responding at each prompt level for each stimulus set during teaching

Participant  Prompt level (in.)  Set 1 (%)  Set 2 (%)  Set 3 (%)  Total (%)
Michael      0                       75.0       85.7      100.0       85.7
             6                      100.0      100.0      100.0      100.0
             12                      94.7       95.2      100.0       96.2
             Total                   92.5       95.2      100.0       95.7
Dwight       0                      100.0      100.0      100.0      100.0
             6                       77.0      100.0      100.0       93.1
             12                      99.3       94.4       96.2       98.2
             Total                   98.7       95.8       97.1       98.0
Andy         0                      100.0      100.0      100.0      100.0
             6                      100.0      100.0       93.7       97.0
             12                     100.0       94.4       95.7       96.5
             Total                  100.0       95.8       95.8       96.4
Pam          0                       64.7       72.7       75.0       69.4
             6                       80.0      100.0      100.0       92.5
             12                     100.0       97.8       97.3       98.4
             Total                   88.8       95.4       94.2       93.6
Jim          0                       14.4      100.0      100.0       25.2
             6                       97.5      100.0       98.3       98.7
             12                      83.3       80.9       79.2       80.9
             Total                   55.1       86.7       85.8       75.3
Angela       0                      100.0      100.0          –      100.0
             6                       96.2       97.3          –       96.5
             12                      36.4       81.2          –       56.8
             Total                   73.8       87.3          –       78.3

Fig. 6 Closed circles indicate the percentage of trials with correct responses during daily probe sessions; open squares and closed triangles indicate the percentage of trials with independent and prompted correct responses during teaching sessions, respectively

Trial-by-Trial Analysis of Prompts During Teaching

Figures 7 and 8 provide a trial-by-trial analysis of when a prompt was provided for each of the targets across all sets and participants. Each panel represents a different participant, and the phase change lines represent when a different set was introduced. The x-axis represents each trial during teaching sessions, and the y-axis represents the prompt level that was provided. All three targets are represented in the panel, with target 1 (closed circles) on the bottom, target 2 (open squares) in the middle, and target 3 (closed triangles) on the top. There are three different prompt levels per target (i.e., 0, 6, and 12 in.) along the y-axis. Thus, upward movement indicates the prompt was faded, and downward movement indicates that a more assistive prompt was provided.

For Michael, Dwight, Andy, and Pam, the data show a quick progression from the most assistive prompt to no prompt across all targets and sets, with few trials on which an incorrect response occurred. For Jim, the data show several occurrences in which the assistance of the prompt was faded; however, once the prompt was faded completely for the targets in sets 1 and 2, incorrect responding occurred and prompts were reintroduced. For Angela, the data show a pattern of responding similar to Jim's across all targets and sets, with consistent incorrect responding when prompts were completely faded.

Fig. 7 Each data path represents one stimulus in the set. Upward movement indicates fading to a less assistive prompt, while downward movement indicates moving to a more assistive prompt

Fig. 8 Each data path represents one stimulus in the set. Upward movement indicates fading to a less assistive prompt, while downward movement indicates moving to a more assistive prompt

Discussion

This study examined the effectiveness of positional prompts when used to teach receptive labels to six individuals diagnosed with ASD. Four of the six participants (i.e., Michael, Dwight, Andy, and Pam) reached the mastery criterion with all three sets of stimuli. Furthermore, these four participants responded correctly on a high percentage of trials during the maintenance condition. With the addition of a token reinforcement system, Jim reached the mastery criterion with the stimuli in set 2 and reached high levels of correct responding with set 3. Angela did not reach the mastery criterion with either set of stimuli that was introduced, and as a result, a third set of stimuli was not introduced. The results also indicated a high percentage of correct responding (independent and prompted) for five of the six participants in the study. Thus, the results of the study showed that the implementation and fading of positional prompts were effective for skill acquisition for most participants. This finding provides clinicians with some additional evidence that positional prompts could be implemented in clinical practice.

Although positional prompts were effective for the majority of participants, these results were not observed for Jim and Angela. Several potential variables may have prevented Jim and Angela from reaching the mastery criterion on some of the stimulus sets. First, it is possible that an effective reinforcer (i.e., praise or the token economy) was not identified. Future researchers may wish to first conduct preference assessments (Fisher et al., 1992) or in-the-moment reinforcer analysis (Leaf et al., in press) to determine effective reinforcers prior to intervention.

Second, the failure to reach the mastery criterion may be due to the manner in which skill acquisition was measured. In this study, the researchers implemented probe sessions to determine mastery of each set. No feedback (i.e., reinforcement or punishment) was provided during probe sessions, which may have had an extinction effect for both Jim and Angela. The case for extinction can especially be made for Angela, as she responded incorrectly on the majority of trials during probe sessions but responded correctly on the majority of trials during teaching. Another reason Angela and Jim may have responded differently is that the interventionists followed strict, prescribed rules for movement between the different prompt levels. It is possible that transfer of stimulus control may have occurred if the fading of the prompt had been varied or had occurred more slowly (e.g., Soluaga et al., 2008).

Third, it could be that Jim and Angela had a shorter history of receiving ABA services (i.e., 6 months compared to 12 months or more for the other participants). This difference could have resulted in Jim and Angela missing some component skills required for the intervention under investigation. Finally, it could be that the use of positional prompts in a most-to-least prompt fading system failed to transfer stimulus control from the prompt to the instruction alone. Future researchers may wish to evaluate positional prompts in other prompt fading systems (e.g., constant time delay, flexible prompt fading, or no-no prompt) to determine if transferring stimulus control could be achieved.

Despite the encouraging results of this study, several limitations could be addressed by future research. First, the interventionists were governed by a strict set of rules for when to move up and down the prompt hierarchy, which may not align with typical clinical practice. The strict rules prevented the interventionists from making needed adjustments to the prompt level based on the participants' behavior in the moment. Researchers have previously argued for the use of more flexible prompting systems, which may result in better learning (Leaf et al., 2014b, 2016). For prompts to be successfully faded, different learners may require individualized fading steps, and future researchers may examine potential predictors of when prompts should be faded more quickly or more slowly. Second, the authors elected to use a nonconcurrent multiple baseline design across stimulus sets as opposed to a concurrent multiple baseline design. The authors selected a nonconcurrent design because it was more practical in this particular setting, taking less time away from clinical sessions each day. Although this design does control for threats to internal validity (Harvey, May, & Kennedy, 2004; Watson & Workman, 1981) and is commonly implemented in research studies, the use of a concurrent multiple baseline design may be desired by future researchers.

The failure of some learners to acquire the targeted skill using positional prompts, as in the case of Jim and Angela, may lead some researchers to suggest avoiding extra-stimulus prompts, such as positional prompts, altogether (e.g., Grow & LeBlanc, 2013). Advising teachers to avoid these prompts is based on the assumption that they inherently lead to faulty stimulus control. Although Jim and Angela failed to learn the targeted labels, there was no evidence of the type of faulty stimulus control that has been discussed in the literature (e.g., Koegel & Rincover, 1976; Green, 2001). Furthermore, the results from the other four participants (i.e., Michael, Dwight, Andy, and Pam) indicate that positional prompts can be an effective means of developing the desired stimulus control. It is important to note that transfer of stimulus control occurs during the fading of the prompt (Etzel & LeBlanc, 1979; Touchette, 1971; Zygmont, Lazar, Dube, & McIlvane, 1992). Perhaps future research should examine best practices for successfully fading various prompt types to develop desired stimulus control, rather than determining which prompt itself is best practice.

Compliance with Ethical Standards

No funding was received for this study. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Informed consent was obtained from the parents of all individual participants included in the study.

References

Etzel, B. C., & LeBlanc, J. M. (1979). The simplest treatment alternative: the law of parsimony applied to choosing appropriate instructional control and errorless-learning procedures for the difficult-to-teach child. Journal of Autism and Developmental Disorders, 9(4), 361–382.

Fisher, W., Piazza, C. C., Bowman, L. G., Hagopian, L. P., Owens, J. C., & Slevin, I. (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25, 491–498.

Green, G. (2001). Behavior analytic instruction for learners with autism: advances in stimulus control technology. Focus on Autism and Other Developmental Disabilities, 16(2), 72–85.

Grow, L., & LeBlanc, L. (2013). Teaching receptive language skills: recommendations for instructors. Behavior Analysis in Practice, 6(1), 56–75.

Harvey, M. T., May, M. E., & Kennedy, C. H. (2004). Nonconcurrent multiple baseline designs and the evaluation of educational systems. Journal of Behavioral Education, 13, 267–276.

Koegel, R. L., & Rincover, A. (1976). Some detrimental effects of using extra stimuli to guide learning in normal and autistic children. Journal of Abnormal Child Psychology, 4, 59–71.

Krantz, P. J., & McClannahan, L. E. (1998). Social interaction skills for children with autism: a script-fading procedure for beginning readers. Journal of Applied Behavior Analysis, 31(2), 191–202.

Leaf, J. B., Alcalay, A., Leaf, J. A., Tsuji, K., Kassardjian, A., Dale, S., et al. (2014a). Comparison of most-to-least to error correction for teaching receptive labelling for two children diagnosed with autism. Journal of Research in Special Educational Needs. doi:10.1111/1471-3802.12067.

Leaf, J. B., Leaf, R., Leaf, J. A., Alcalay, A., Ravid, D., Dale, S., Kassardjian, A., Tsuji, K., Taubman, M., McEachin, J., & Oppenheim-Leaf, M. L. (in press). Comparing paired-stimulus preference assessments to the in-the-moment reinforcer analysis on skill acquisition: a preliminary analysis. Focus on Autism and Other Developmental Disabilities.

Leaf, J. B., Leaf, R., McEachin, J., Taubman, M., Ala’i-Rosales, S., Ross, R. K., et al. (2016). Applied behavior analysis is a science and, therefore, progressive. Journal of Autism and Developmental Disorders, 46(2), 720–731.

Leaf, J. B., Leaf, R., Taubman, M., McEachin, J., & Delmolino, L. (2014b). Comparison of flexible prompt fading to error correction for children with autism spectrum disorder. Journal of Developmental and Physical Disabilities, 26(2), 203–224.

Leaf, J. B., Sheldon, J. B., & Sherman, J. A. (2010). Comparison of simultaneous prompting and no-no prompting in two-choice discrimination learning with children with autism. Journal of Applied Behavior Analysis, 43, 215–228.

Libby, M. E., Weiss, J. S., Bancroft, S., & Ahearn, W. H. (2008). A comparison of most-to-least and least-to-most prompting on acquisition of solitary play skills. Behavior Analysis in Practice, 1, 37–43.

Lovaas, O. I. (1981). Teaching developmentally disabled children: the me book. Austin, TX: PRO-ED Books.

Lovaas, O. I. (2003). Teaching individuals with developmental delays: basic intervention techniques. Austin, TX: PRO-ED Books.

MacDuff, G. S., Krantz, P. J., & McClannahan, L. E. (2001). Prompts and prompt-fading strategies for people with autism. In C. Maurice, G. Green, & R. M. Foxx (Eds.), Making a difference: behavioral intervention for autism (1st ed., pp. 37–50). Austin, TX: Pro Ed.

Soluaga, D., Leaf, J. B., Taubman, M., McEachin, J., & Leaf, R. (2008). A comparison of flexible prompt fading and constant time delay for five children with autism. Research in Autism Spectrum Disorders, 2, 753–765.

Touchette, P. E. (1971). Transfer of stimulus control: measuring the moment of transfer. Journal of the Experimental Analysis of Behavior, 15(3), 347–354.

Walker, G. (2008). Constant and progressive time delay procedures for teaching children with autism: a literature review. Journal of Autism and Developmental Disorders, 38, 261–275.

Watson, P. J., & Workman, E. A. (1981). The non-concurrent multiple baseline across-individuals design: an extension of the traditional multiple baseline design. Journal of Behavior Therapy and Experimental Psychiatry, 12, 257–259.

Wolery, M., Ault, M. J., & Doyle, P. M. (1992a). Teaching students with moderate to severe disabilities: Use of response prompting strategies. White Plains, NY: Longman.

Wolery, M., Holcombe, A., Werts, M. G., & Cipolloni, R. M. (1992b). Effects of simultaneous prompting and instructive feedback. Early Education & Development, 4(1), 20–31.

Zygmont, D. M., Lazar, R. M., Dube, W. V., & McIlvane, W. J. (1992). Teaching arbitrary matching via sample stimulus-control shaping to young children and mentally retarded individuals: a methodological note. Journal of the Experimental Analysis of Behavior, 57(1), 109–117.

Increasing Instructional Efficiency By Presenting Additional Stimuli In Learning Trial For Children With Autism Spectrum Disorders


Jan 5, 2026

 

Jason C. Vladescu
Caldwell College
And
Tiffany M. Kodak
University of Oregon

Abstract

The current study examined the effectiveness and efficiency of presenting secondary targets within learning trials for 4 children with an autism spectrum disorder. Specifically, we compared 4 instructional conditions using a progressive prompt delay. In 3 conditions, we presented secondary targets in the antecedent or consequence portion of learning trials, or in the absence of prompts and reinforcement. In the fourth condition (control), we did not include secondary targets in learning trials. Results replicate and extend previous research by demonstrating that the majority of participants acquired secondary targets presented in the antecedent and consequent events of learning trials.

Key words: autism spectrum disorders, discrete-trial instruction, instructional efficiency, instructive feedback

We thank Elizabeth Bullington, Regina Carroll, Andrea Clements, Lindsey Loutsch, Laura Mulford, Melissa Nissen, Carissa Nohr, and Jessi Sexton for their assistance with various aspects of data collection.
Address correspondence to Jason C. Vladescu, Department of Applied Behavior Analysis, Caldwell College, 120 Bloomfield Avenue, Caldwell, New Jersey 07006 (e-mail: jvladescu@caldwell.edu).
doi: 10.1002/jaba.70

 

Although discrete-trial instruction (DTI) is an effective teaching practice for many learners with an autism spectrum disorder (ASD), it may not close the gap between their skill level and that of their typically developing peers. Therefore, it is important to identify procedures that further increase the efficiency of this instruction format. Determinations regarding instructional efficiency have centered on how rapidly acquisition occurs during one instructional method compared to another. Comparisons may be made based on time or trials to criterion (e.g., Ingvarsson & Hollobaugh, 2011). It has also been suggested that conclusions regarding efficiency be based on the number of skills acquired and the effects on future learning (e.g., Reichow & Wolery, 2011; M. Wolery, Werts, & Holcombe, 1993; T. D. Wolery, Schuster, & Collins, 2000). The latter conceptualizations have been explored through the presentation of additional stimuli within learning trials (typically prior to or immediately following a response opportunity in the presence of a target stimulus). These additional stimuli are presented without requiring a response from the learner or programming consequences if the learner does engage in a correct response (Anthony, Wolery, Werts, Caldwell, & Snyder, 1996; Werts, Wolery, Holcombe, & Frederick, 1993).

Some studies presented secondary targets in the antecedent portion of learning trials (i.e., prior to the presentation of the target discriminative stimulus; e.g., M. Wolery, Ault, Gast, Doyle, & Mills, 1990), whereas others presented secondary targets during the consequence portion of learning trials (i.e., following the consequence provided contingent on the learner's behavior; e.g., Cromer, Schuster, Collins, & Grisham-Brown, 1998; M. Wolery et al., 1991). For example, while teaching sight words to three teenagers diagnosed with intellectual disabilities, T. D. Wolery et al. (2000) compared an instructional condition in which teachers presented secondary targets prior to the presentation of a primary target (e.g., "This word is 'pencil,'" followed by holding up a different card and presenting the instruction "What word?") to an instructional condition in which teachers presented secondary targets after delivering praise contingent on a correct response to a target stimulus (e.g., "Great work reading the word 'book'! This word is 'pencil'"). Although the participants did not demonstrate mastery-level responding to the secondary targets in either condition without direct teaching, they acquired the secondary targets more quickly than targets not presented in earlier learning trials. These results are consistent with other studies (e.g., Winterling, 1990) and indicate that presenting secondary targets in the antecedent or consequence portion of the learning trial may increase instructional efficiency.

Reichow and Wolery (2011) extended previous research on the effects of presenting secondary targets on the acquisition of sight words to children with an ASD. The experimenters examined the efficacy and efficiency of this strategy by comparing the number of sessions and time required to achieve mastery-level performance during progressive prompt-delay conditions with or without secondary targets. Their results indicated that presenting secondary targets in a progressive prompt-delay procedure was approximately twice as efficient compared to instruction without secondary targets.

However, Reichow and Wolery (2011) did not evaluate the point at which the participants acquired the secondary targets. That is, they presented secondary targets in the learning trials and evaluated whether participants acquired these targets following mastery-level responding to primary targets exposed to direct training. It may be beneficial to determine the point at which secondary targets are acquired during training and observe the rate of acquisition of these stimuli. This information may provide an estimate of the acquisition of secondary targets prior to the completion of training with primary targets, which may indicate whether additional training with secondary targets will be necessary. In addition, Reichow and Wolery, as well as other previous studies (e.g., T. D. Wolery et al., 2000), did not include a condition in which instructors presented secondary targets in the absence of instruction. Such a condition could be used to examine the minimal conditions under which learners may acquire secondary targets. Finally, to our knowledge, no studies involving secondary targets have evaluated participant behavior (e.g., imitating the teacher’s presentation of the secondary target) that may aid in the acquisition of secondary targets.

The current investigation sought to replicate and extend the extant literature on presenting secondary targets in learning trials in several ways. First, we assessed the efficiency of presenting secondary targets in the antecedent portion of learning trials with individuals with an autism spectrum disorder. Second, we compared the efficiency of presenting secondary targets in the antecedent and consequence portion of learning trials. To our knowledge, this study represents the first comparison between these conditions for participants with an autism spectrum disorder. Third, we included probes of the secondary targets during the training of primary targets to determine if and when the participants acquired the secondary targets. Fourth, we measured whether participants echoed the experimenter’s presentation of secondary targets. Fifth, we included a comparison condition in which we exposed participants to secondary targets in the absence of teaching (i.e., the secondary targets were not presented in learning trials).

Method

Participants and Setting

Four children with an autism spectrum disorder participated. Each child received his or her diagnosis from a multidisciplinary clinic specializing in the assessment of ASD. All children received early intervention services at a hospital-based clinic and had a history of training with prompt-delay procedures. However, none of the participants had previous exposure to the presentation of secondary targets within learning trials. Winnie was a Caucasian 7-year-old girl who had been diagnosed with autism and who used four- to six-word phrases to mand for or tact items spontaneously. Winnie answered simple social questions (e.g., “How are you?”) and completed fill-in-the-blank statements (e.g., “Twinkle, twinkle little” [child says “star”]). We conducted a Peabody Picture Vocabulary Test-4 (PPVT-4) with Winnie, and her age equivalent score was 3.6.

Kevin was a Caucasian 5-year-old boy who had been diagnosed with autism and who used two- to four-word phrases mainly to mand for items. He responded correctly to a small number of fill-in-the-blank questions, identified common objects, and had a well-developed echoic repertoire. Kevin's PPVT-4 results indicated an age equivalent score of 2.3. Dwight was a Caucasian 3-year-old boy who had been diagnosed with pervasive developmental disorder not otherwise specified and who spontaneously engaged in mands for items and activities using three or more words. Dwight also emitted at least 300 tacts of items, activities, and people, and he answered social questions (e.g., "How are you today?"). Rick was an African-American 6-year-old boy who had been diagnosed with autism and who spontaneously engaged in mands for and tacts of at least 300 items using four- to seven-word phrases. He answered and initiated social questions and completed fill-in-the-blank statements.

We conducted all sessions in the participant’s typical therapy room. Each room contained a table, chairs, and plastic tubs in which we placed materials for the session. The therapist and participant sat adjacent to or across from each other at a table during all sessions. A secondary observer sat in a chair at or close to the table during a proportion of sessions.

Response Measurement and Interobserver Agreement

Observers recorded data using data sheets specifically prepared for each session. For each trial, the data sheet specified the target, the correct answer, and letter codes corresponding to participants’ (a) correct response, defined as the participant emitting the target response prior to the delivery of the controlling prompt; (b) incorrect response or no response, defined as the participant emitting an error of commission (i.e., responding incorrectly) or omission (i.e., nonresponding) prior to the delivery of the controlling prompt, respectively; (c) prompted correct response, defined as the participant providing the target response after the delivery of the controlling prompt; (d) prompted incorrect response or prompted no response, defined as the participant making an error of commission or omission following the delivery of the controlling prompt, respectively; and (e) correct echo (for conditions including secondary targets), defined as the participant correctly imitating the experimenter’s vocal model of the secondary target within 5 s. Data were collected on participants’ incorrect responses to make decisions to increase the prompt delay (described below). We recorded session duration using a digital handheld timer for Dwight and Rick, but not for Kevin and Winnie.

A second independent observer simultaneously collected data during at least 44% of the sessions in each condition, and we calculated agreement by comparing observers’ records on a trial-by-trial basis. We scored an agreement for trials that both observers coded identically. We divided the number of trials in agreement by the number of trials with agreements plus disagreements and converted the ratio to a percentage. Mean interobserver agreement for trials across all conditions was 98% (range, 50% to 100%) for Winnie, 98% (range, 67% to 100%) for Kevin, 99% (range, 83% to 100%) for Dwight, and 97% (range, 67% to 100%) for Rick.
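The trial-by-trial agreement calculation described above can be sketched in a few lines. This is an illustrative implementation (the function name and the letter codes in the example are hypothetical, not from the study): a trial counts as an agreement only when both observers coded it identically, and the percentage is agreements divided by total trials.

```python
def trial_by_trial_ioa(observer1, observer2):
    """Percentage of trials that two observers coded identically."""
    if len(observer1) != len(observer2):
        raise ValueError("Observers must code the same number of trials")
    # A trial agrees only when both codes match exactly.
    agreements = sum(a == b for a, b in zip(observer1, observer2))
    return 100.0 * agreements / len(observer1)

# Hypothetical 12-trial session using codes such as C (correct),
# I (incorrect), PC (prompted correct), PI (prompted incorrect).
primary = ["C", "C", "I", "PC", "C", "C", "C", "PI", "C", "C", "C", "C"]
secondary = ["C", "C", "I", "PC", "C", "I", "C", "PI", "C", "C", "C", "C"]
print(round(trial_by_trial_ioa(primary, secondary), 1))  # 11/12 agree -> 91.7
```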

Preference Assessment

The experimenter conducted a paired-choice preference assessment (Fisher et al., 1992) with each participant prior to the beginning of the evaluation to identify highly preferred food items. In addition, we conducted a daily multiple stimulus without replacement (MSWO; DeLeon & Iwata, 1996) assessment with the top five ranked items from the paired-choice assessment. The experimenter delivered the three most highly preferred food items following correct responding during training.

Pretest

Prior to baseline, the experimenter conducted pretests to identify target stimuli for each condition and participant. We created a pool of potential targets based on individualized intervention goals for each participant. These included tacts of pictures (for Kevin and Winnie) and intraverbal fill-in-the-blank statements (for Dwight and Rick). Pretest trials consisted of the experimenter holding up a picture card and asking, "What is it?" or presenting an antecedent verbal stimulus associated with the fill-in-the-blank statement (e.g., "The opposite of hot is —"). Participants had 5 s to respond. The experimenter presented each potential target four times per session in a random order. The experimenter did not provide feedback for correct or incorrect responses during the pretest; mastered tasks were interspersed on about every other trial, and reinforcement was provided for correct responses to mastered tasks to maintain motivation.

We discarded all potential targets to which the participant responded correctly at least once; we pseudorandomly assigned the remaining targets to one of six or eight sets. For Winnie and Kevin, each set included three targets. Dwight's and Rick's sets included six intraverbal fill-in-the-blank statements. We equated stimulus sets by assigning stimuli to each condition based on the number of syllables contained in target responses and ensuring that targets that sounded similar were not in the same set. We assigned sets to one of four conditions: (a) primary targets with secondary targets placed in the antecedent portion of the learning trial (hereafter referred to as the antecedent condition), (b) primary targets with secondary targets placed in the consequent portion of the learning trial (hereafter referred to as the consequence condition), (c) secondary targets in the absence of teaching primary targets (hereafter referred to as the secondary-targets-only condition), and (d) primary targets in the absence of secondary targets (hereafter referred to as the primary-targets-only condition). We assigned two sets of targets to the antecedent and consequence conditions. For these conditions, one set served as the primary targets and the other as the secondary targets. The secondary-targets-only and primary-targets-only conditions each contained one set. Baseline control sets also were included for Rick and during a replication comparison for Dwight. A list of the targets in each set and condition is available in the supporting information or from the first author.

Design and General Procedure

We evaluated the effects of training with and without secondary targets on the acquisition of tacts and intraverbal fill-in-the-blank statements using an adapted alternating treatments design (Sindelar, Rosenberg, & Wilson, 1985). The treatment comparison was conducted twice with Dwight for replication purposes. We conducted one to 11 sessions per day, 1 to 5 days per week; all sessions consisted of 12 trials (excluding the presentation of the secondary targets with the exception of the secondary-targets-only condition).

The first two instructional sessions in the antecedent, consequence, and primary-targets-only conditions included trials with a 0-s prompt delay. During these trials, the experimenter immediately provided an echoic prompt following the presentation of the nonverbal or verbal stimulus, depending on the target stimuli. Following sessions at a 0-s delay, the experimenter increased the prompt delay to 1 s. The experimenter subsequently increased the prompt delay by 1 s per session based on each participant's pattern of responding. That is, we increased the prompt delay by 1 s if the participant engaged in no response (i.e., errors of omission) on the majority (at least 50%) of unprompted incorrect responses. If the majority of the participant's unprompted incorrect responses were incorrect responses (i.e., errors of commission), the prompt-delay value remained the same for the following session.
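The session-to-session decision rule above is algorithmic enough to sketch. This is a minimal, hypothetical rendering (function and parameter names are illustrative; the study does not state what happens in an error-free session, so the sketch assumes the delay also increases then):

```python
def next_prompt_delay(current_delay_s, omission_errors, commission_errors):
    """Prompt delay (s) for the next session under the progressive rule.

    Increase by 1 s when omissions make up at least half of the session's
    unprompted incorrect responses; otherwise hold the delay constant.
    Assumption: an error-free session also advances the delay.
    """
    total_errors = omission_errors + commission_errors
    if total_errors == 0 or omission_errors >= commission_errors:
        return current_delay_s + 1
    return current_delay_s

# Mostly omissions: delay advances from 1 s to 2 s.
print(next_prompt_delay(1, omission_errors=2, commission_errors=1))  # 2
# Mostly commissions: delay stays at 2 s.
print(next_prompt_delay(2, omission_errors=1, commission_errors=3))  # 2
```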

The experimenter implemented an error-correction procedure contingent on errors in the antecedent, consequence, and primary-targets-only conditions. During error correction, the experimenter provided a vocal model of the correct response and an opportunity for the participant to respond. The experimenter provided affirmative statements (e.g., "yep") contingent on the participant correctly echoing the vocal model prompt, and the experimenter repeated the trial until the participant provided a correct response (although correct responses during error correction were not included in the session data). The experimenter delivered praise and an edible item contingent on a correct response during error correction.

Teaching of the primary targets continued until the participant’s correct responses reached the mastery criteria. Kevin’s, Dwight’s, and Rick’s mastery criteria were three consecutive sessions with correct responses at or above 90% or two consecutive sessions at 100%. Winnie’s mastery criterion was two consecutive sessions with correct responses at or above 90%. Similar to Reichow and Wolery (2011), we conducted review sessions after every other instructional session for any condition in which a participant demonstrated mastery-level responding while teaching continued in the other conditions. Review sessions had the same format as instructional sessions.
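The mastery criteria above amount to a simple check over the running list of session scores. A hedged sketch (names are illustrative, not from the study): mastery is met after three consecutive sessions at or above 90% correct or two consecutive sessions at 100%, with Winnie's criterion obtained by lowering the consecutive-session count to two.

```python
def met_mastery(session_percents, n_high=3, high=90.0, n_perfect=2):
    """True once the most recent sessions satisfy either mastery criterion."""
    # Two (n_perfect) consecutive sessions at exactly 100% correct.
    if len(session_percents) >= n_perfect and all(
            p == 100.0 for p in session_percents[-n_perfect:]):
        return True
    # Three (n_high) consecutive sessions at or above 90% correct.
    return len(session_percents) >= n_high and all(
            p >= high for p in session_percents[-n_high:])

print(met_mastery([75.0, 91.7, 100.0, 100.0]))          # True (two at 100%)
print(met_mastery([91.7, 83.3, 91.7]))                  # False
print(met_mastery([83.3, 91.7, 91.7], n_high=2))        # True (Winnie's rule)
```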

We conducted a probe session 10 min to 120 min after each session of the antecedent, consequence, and secondary-targets-only conditions to measure participants’ acquisition of secondary targets. If the participant did not demonstrate mastery-level responding to secondary targets prior to mastery of all primary targets, we directly trained these secondary targets following mastery of all primary targets. Direct training was only necessary for Kevin.

Baseline. We conducted a minimum of three baseline sessions for each condition and extended baseline until participants demonstrated three consecutive sessions with a stable or decreasing trend in correct responding with a mean below 35%. For tact targets, the experimenter held up the target picture card and asked, “What is it?” For fill-in-the-blank statements, the experimenter presented the antecedent verbal stimulus that did not include the final word in the sentence (e.g., “The opposite of hot is —”). For both tasks, the participants had 5 s to respond. The experimenter did not provide any feedback for correct or incorrect responses. The experimenter presented targets an equal number of times in a random order during each session. For Winnie and Kevin, we conducted sessions under baseline conditions following mastery of the primary targets because they did not demonstrate mastery of the secondary targets during probes. This served as a baseline for teaching of the secondary targets.

Antecedent condition. Each trial consisted of the experimenter establishing ready behavior (e.g., ensuring that the participant’s body was still or prompting the participant to put his or her hands in the lap) and presenting a secondary target (e.g., holding up a picture and saying, “This is a seal”). The experimenter did not provide differential consequences for participant responses following the presentation of the secondary target. After approximately 3 s, the experimenter then presented the stimulus relevant to the primary target (e.g., held up a picture of a lion and asked, “What is it?”). The experimenter delivered a preferred edible and praise contingent on a correct response to the primary target. If the participant made an error, the experimenter implemented the error-correction procedure (as previously described). We randomly assigned primary and secondary targets to trials (i.e., we did not systematically pair primary and secondary targets).

Consequence condition. Immediately following the delivery of reinforcement for responding to the primary target, the experimenter presented the secondary target. That is, while the child consumed the edible item, the experimenter presented the secondary target (i.e., held up a picture and said “This is a seal”). All other procedures were identical to the antecedent condition.

Secondary-targets-only condition. The experimenter presented secondary targets in the absence of primary targets. That is, the experimenter established ready behavior, presented the secondary target, recorded the participant’s response to the secondary target, and moved to the next trial. The experimenter did not provide any differential consequences for responses to secondary targets, as in the other conditions, and avoided incidental responses (e.g., a smile) during sessions. Contingent on appropriate behavior (e.g., sitting quietly, making eye contact), the experimenter provided a preferred edible and praise about every other trial during the intertrial interval to maintain participation in the session. This condition was designed to examine the effects of presenting secondary targets in the absence of instruction for primary targets and programmed consequences.

Primary-targets-only condition. The training procedures were identical to those described above (see antecedent and consequence conditions), with the exception that no secondary targets were included in trials. This condition was designed to measure acquisition under teaching practices typically encountered in early intervention programs. Thus, this condition allowed a comparison of the number of sessions required to reach the mastery criteria when we did not present secondary targets in learning trials.

Secondary-target probes. We measured the emergence and acquisition of secondary targets presented in the antecedent, consequence, and secondary-targets-only conditions during ongoing training with primary targets. Depending on the participant’s schedule, the experimenter conducted a probe of secondary targets following every one to three sessions of training in other conditions. We conducted these probes using the procedures described in baseline.

Control condition. The experimenter conducted these sessions using procedures identical to those in the baseline condition. We included this condition in Rick’s treatment comparison and in Dwight’s second treatment comparison as a control condition. We believed this addition was necessary as the demonstration of experimental control for Winnie’s treatment comparison and for Dwight’s first treatment comparison was relatively weak. During these evaluations, Winnie and Dwight unexpectedly acquired the targets from the secondary-targets-only condition without direct teaching; we expected direct training to be necessary. Thus, we cannot rule out the effects of maturation or repeated exposure as an explanation for these gains.

Results

Figures 1, 2, and 3 show the results of training in each condition for the participants. In each figure, the top panel shows the participants’ percentage of correct responses to the primary targets across conditions. The participants’ percentage of correct responses during probes of the secondary targets is displayed in the bottom panel of each figure.

Figure 1: The percentage of correct responses to primary and secondary targets in each condition for Winnie and Kevin. Lines connecting data points were removed for review sessions that were conducted with mastered stimuli while we continued to conduct training in other conditions. BL = baseline.

During baseline, the participants’ correct responses were at or near zero across primary and secondary targets in all conditions. During training, Winnie acquired the primary targets in the antecedent, consequence, and primary-targets-only conditions in 6, 11, and 9 sessions, respectively (Figure 1, left column, top panel). During conditions that included the presentation of secondary targets, Winnie almost always echoed the experimenter’s vocal model of secondary targets. She correctly echoed secondary targets on 96%, 98%, and 93% of opportunities during the antecedent, consequence, and secondary-targets-only conditions, respectively (data not depicted in Figure 1). She did not master the secondary target sets prior to acquisition of the primary targets, although there was improvement in each set. However, Winnie met the mastery criteria for all secondary targets during the subsequent baseline condition (Figure 1, left column, bottom panel). As such, direct training of secondary targets was unnecessary.

Figure 1 displays Kevin’s responding to primary and secondary targets across conditions. Kevin showed mastery-level responding to the primary targets in 18, 21, and 17 sessions in the antecedent condition, consequence condition, and the primary-targets-only condition (Set 1), respectively (Figure 1, right column, top panel). Kevin correctly echoed secondary targets during the antecedent, consequence, and secondary-targets-only conditions during 90%, 85%, and 90% of opportunities, respectively. Unlike Winnie, Kevin did not reach criterion-level performance for the secondary targets prior to or immediately following mastery of the primary targets. Therefore, we directly taught all secondary targets as well as a new set of primary targets in the primary-targets-only condition (Set 2). Kevin acquired the second set of primary targets in the primary-targets-only condition in nine sessions, but he required substantially more sessions (22, 14, and 22 sessions, respectively) to master the secondary targets from the antecedent, consequence, and secondary-targets-only conditions.

During training, Rick acquired the primary targets in the antecedent, consequence, and primary-targets-only conditions in five, six, and five sessions, respectively (Figure 2, top panel). He correctly echoed the secondary targets on 100% of opportunities during the antecedent, consequence, and secondary-targets-only conditions. Rick demonstrated mastery of the secondary targets during the training of primary targets; in fact, these secondary targets were mastered in a similar number of sessions as the primary targets (Figure 2, bottom panel).

Figure 2: The percentage of correct responses to primary and secondary targets in each condition for Rick. Lines connecting data points were removed for review sessions that were conducted with mastered stimuli while we continued to conduct training in other conditions.

 

Dwight’s responding to primary and secondary targets during his first treatment comparison is displayed in Figure 3 (left column). Dwight mastered the primary targets in the antecedent, consequence, and primary-targets-only conditions in 7, 10, and 8 sessions, respectively (Figure 3, left column, top panel). Similar to the other participants, he correctly echoed the secondary targets during 100%, 99%, and 100% of opportunities in the antecedent, consequence, and secondary-targets-only conditions, respectively. Dwight acquired the secondary targets presented within the antecedent and consequence conditions in five and nine probe sessions, respectively. Dwight reached criterion-level performance for the secondary-targets-only condition in just four probe sessions (Figure 3, left column, bottom panel).

Figure 3 (right column) contains the replication data for Dwight. He acquired the primary targets in seven sessions in the antecedent, consequence, and primary-targets-only conditions (Figure 3, right column, top panel). Dwight echoed secondary targets during 100% of opportunities in the antecedent, consequence, and secondary-targets-only conditions. Dwight also showed mastery-level responding to all secondary targets during the training of primary targets and in a number of sessions similar to that of the primary targets (Figure 3, right column, bottom panel). He acquired the secondary targets presented within the antecedent and consequence conditions in eight and four probe sessions, respectively. Dwight reached criterion-level performance for the secondary-targets-only condition in six probe sessions.

Figure 3: The percentage of correct responses to primary and secondary targets in each condition for Dwight. Lines connecting data points were removed for review sessions that were conducted with mastered stimuli while we continued to conduct training in other conditions

To make additional comparisons regarding instructional efficiency, we calculated the training time per acquired target for each condition by dividing the total training time for each condition by the number of acquired targets for Rick and Dwight (session duration data were unavailable for Kevin and Winnie). For both participants, conditions involving secondary targets required the least amount of training time per target (M = 1 min 42 s, M = 1 min 50 s, and M = 2 min in the antecedent, consequence, and secondary-targets-only conditions, respectively, for Rick; M = 2 min 31 s, M = 3 min, and M = 1 min 53 s in the antecedent, consequence, and secondary-targets-only conditions, respectively, for Dwight’s first treatment comparison; and M = 3 min 1 s, M = 2 min 41 s, and M = 3 min 6 s in the antecedent, consequence, and secondary-targets-only conditions, respectively, for Dwight’s second treatment comparison) relative to the primary-targets-only condition, the procedure most often used during instruction in early intervention programs (M = 3 min 17 s for Rick; M = 5 min 2 s for Dwight’s first comparison; and M = 4 min 51 s for Dwight’s second comparison).
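The efficiency metric here is simply total training time divided by the number of targets acquired. As a sketch (the function name and the example totals are hypothetical, chosen only to illustrate the arithmetic), the computation is:

```python
def time_per_target(total_training_seconds, n_acquired):
    """Mean training time per acquired target, in seconds."""
    return total_training_seconds / n_acquired

# Hypothetical example: a condition totaling 20 min 24 s of training
# in which the learner acquired 12 targets (primary plus secondary).
secs = time_per_target(20 * 60 + 24, 12)
print(f"{int(secs // 60)} min {int(secs % 60)} s per target")  # 1 min 42 s per target
```

Because conditions with secondary targets yield roughly twice as many acquired targets for comparable session time, the denominator doubles and the per-target time drops.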

Discussion

We evaluated the effects of presenting secondary targets in learning trials to teach four children with ASD to tact common objects and respond to intraverbal fill-in-the-blank statements. Three of the four participants acquired the secondary targets without explicit instruction. Presenting secondary targets in learning trials also was a more efficient approach to intervention for three of the four participants because they mastered double the number of stimuli in conditions that included primary and secondary targets compared to conditions that included primary targets only in a similar amount of training time. Furthermore, participants frequently echoed the experimenter’s vocal model of secondary targets in the absence of prompting or reinforcement of that response. These results provide additional evidence to support the use of these teaching procedures during early intervention programming with children with an ASD.

Results are similar to those of Reichow and Wolery (2011), who demonstrated that presenting secondary targets as a consequent event during learning trials was more efficient for teaching sight words than similar teaching protocols without secondary targets. The present study also extended previous research by demonstrating that, for at least some learners with an ASD, (a) presenting secondary targets in the antecedent portion of learning trials may produce outcomes similar to presenting secondary targets in the consequence portion, (b) some children may acquire targets in a condition in which most of the components of DTI are omitted (i.e., the secondary-targets-only condition, in which participants were not required to emit a response and the experimenter did not deliver controlling prompts or provide reinforcement for correct responses; as demonstrated by Winnie, Dwight, and Rick), and (c) collecting ongoing probe data is useful for evaluating the emergence and acquisition of secondary targets during primary-target instruction.

Similar to T. D. Wolery et al. (2000), our results suggest that differences in the results of antecedent and consequence conditions were often minimal. The selection of one of these arrangements may be left to the instructional programmer’s discretion. However, it seems preferable to evaluate the learner’s preference for these different experimental arrangements. Future studies could evaluate children’s preference for teaching arrangements that do or do not include the presentation of secondary targets. It is also possible that differences in the efficiency of procedures may be related to the learner’s response characteristics (Kodak et al., 2011) or instructional history (Coon & Miguel, 2012). Identifying predictors of efficient teaching procedures is an important area for research.

By evaluating the emergence and acquisition of secondary targets during ongoing instruction for primary targets, we were able to determine the point at which participants acquired the secondary targets. Surprisingly, several participants acquired the secondary targets before mastering the primary targets that received direct training. To our knowledge, only Anthony et al. (1996) evaluated the acquisition of secondary targets during primary-target instruction. Anthony et al. found that participants did not demonstrate mastery-level responding to secondary targets during probes prior to completing training of the primary targets. Thus, the authors recommended against conducting probes. However, our results suggested that some individuals may acquire secondary targets prior to primary targets; probes are necessary to identify this acquisition. If probes indicate the participant has mastered the secondary targets before the primary targets, it may be possible to introduce novel secondary targets; however, determining the frequency of these probes to maximize teaching efficiency remains an important area for future research.

We, as well as previous researchers (Reichow & Wolery, 2011), hypothesize that a generalized imitation repertoire may be important in the acquisition of secondary targets. All participants echoed the secondary targets consistently, but the acquisition of secondary targets was variable. Winnie, Dwight, and Rick learned the secondary targets without direct training, whereas Kevin required direct training. It is important to note that Kevin’s attendance during the evaluation was inconsistent due to illness and family vacations; therefore, his results may not accurately reflect the role of echoic behavior in the success of this instructional procedure.

Acquisition of secondary targets may be related to a demand characteristic (M. Wolery et al., 1993). We presented all of the secondary targets within a similar instructional context as primary targets such that participants had prior histories of reinforcement for attending to stimuli and responding to instructions. Another, but similar, way to conceptualize responding to secondary targets is that participants demonstrated generalized imitation (Baer, Peterson, & Sherman, 1967). We reinforced imitative behavior following prompts during the training of primary targets. The reinforcement contingencies for imitating the experimenter’s vocal model may have been indiscriminable across conditions and targets.

Several limitations of the current evaluation should be mentioned. We did not collect treatment integrity data and recommend future studies collect these data, particularly in relation to the experimenter’s responses to the secondary targets, to ensure the experimenter does not reinforce participant responses during probes. Also, the demonstration of experimental control for Winnie’s and Dwight’s first treatment comparison was relatively weak, as they unexpectedly acquired the targets from the secondary-targets-only condition without direct teaching.

Thus, we cannot rule out the effects of maturation or repeated exposure as an explanation for these gains. We enhanced the demonstration of experimental control for Rick’s treatment evaluation and for Dwight’s second treatment comparison by including a control condition, controlling for the effects of repeated exposure to the materials and for maturation effects.

There was also overlap in responses taught within Dwight’s first treatment comparison.

More specifically, we included some symmetrical opposites across conditions (e.g., “The opposite of back is —” was a primary target in the antecedent condition, and “The opposite of front is —” was a primary target in the consequence condition). Although we cannot exclude the possibility that teaching of one response may have resulted in behavior change in another condition, it should be noted that Dwight consistently failed to demonstrate symmetrical intraverbal responses during his typical educational programming. Nonetheless, such relations are better avoided in future research.

Finally, there was minimal within-subject replication in this study (Dwight only). During Dwight’s first treatment comparison, the secondary-targets-only condition was most efficient, whereas the consequence condition was most efficient in the second treatment comparison. This difference highlights the importance of within-subject replications; it is unclear whether each participant’s results reflect a consistent advantage for one teaching procedure for that participant. We recommend additional within-subject replications in future research.

Additional research also could evaluate the outcome of exposing individuals to a condition analogous to our secondary-targets-only condition in less structured teaching settings (e.g., home, community outings). Outcomes of these studies would help demonstrate the generality of the effects of presenting secondary targets alone, may aid in identifying the behavioral processes responsible for the acquisition of secondary targets, and could help guide clinicians in determining the types of learners for whom these instructional practices are likely to be most effective.

References

Anthony, L., Wolery, M., Werts, M. G., Caldwell, N. K., & Snyder, E. D. (1996). Effects of daily probing on acquisition of instructive feedback responses. Journal of Behavioral Education, 6, 111–133. doi: 10.1007/BF02110228

Baer, D. M., Peterson, R. F., & Sherman, J. A. (1967). The development of imitation by reinforcing behavioral similarity to a model. Journal of the Experimental Analysis of Behavior, 10, 405–416. doi: 10.1901/jeab.1967.10-405

Coon, J. T., & Miguel, C. F. (2012). The role of increased exposure to transfer of stimulus control procedures on the acquisition of intraverbal behavior. Journal of Applied Behavior Analysis, 45, 657–666. doi: 10.1901/jaba.2012.45-657

Cromer, K., Schuster, J. W., Collins, B. C., & Grisham-Brown, J. (1998). Teaching information on medical prescriptions using two instructive feedback schedules. Journal of Behavioral Education, 8, 37–61. doi: 10.1023/A:1022812723663

DeLeon, I. G., & Iwata, B. A. (1996). Evaluation of a multiple-stimulus presentation format for assessing reinforcer preference. Journal of Applied Behavior Analysis, 29, 519–532. doi: 10.1901/jaba.1996.29-519

Fisher, W., Piazza, C. C., Bowman, L. G., Hagopian, L. P., Owens, J. A., & Slevin, I. (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25, 491–498. doi: 10.1901/jaba.1992.25-491

Ingvarsson, E. T., & Hollobaugh, T. (2011). A comparison of prompting tactics to establish intraverbals in children with autism. Journal of Applied Behavior Analysis, 44, 659–664.

Kodak, T., Fisher, W. W., Clements, A., Paden, A. R., & Dickes, N. R. (2011). Functional assessment of instructional variables: Linking assessment and treatment. Research in Autism Spectrum Disorders, 5, 1059–1077. doi: 10.1016/j.rasd.2010.11.012

Reichow, B., & Wolery, M. (2011). Comparison of progressive prompt delay with and without instructive feedback. Journal of Applied Behavior Analysis, 44, 327–340. doi: 10.1901/jaba.2011.44-327

Sindelar, P. T., Rosenberg, M. S., & Wilson, R. J. (1985). An adapted alternating treatments design for instructional research. Education and Treatment of Children, 8, 67–76.

Werts, M. G., Wolery, M., Holcombe, A., & Frederick, C. (1993). Effects of instructive feedback related and unrelated to the target behaviors. Exceptionality, 4, 81–95. doi: 10.1207/s15327035ex0402_2

Winterling, V. (1990). The effects of constant time delay, practice in writing or spelling, and reinforcement on sight word recognition in a small group. Journal of Special Education, 24, 101–116. doi: 10.1177/002246699002400108

Wolery, M., Ault, M. J., Gast, D. L., Doyle, P. M., & Mills, B. (1990). Use of choral and individual attentional responses with constant time delay when teaching sight word reading. Remedial and Special Education, 11, 47–58.

Wolery, M., Doyle, P. M., Ault, M. L., Gast, D. L., Meyer, S., & Stinson, D. (1991). Effects of presenting incidental information in consequent events on future learning. Journal of Behavioral Education, 1, 79–104. doi: 10.1007/BF00956755

Wolery, M., Werts, M. G., & Holcombe, A. (1993). Reflections on “effects of instructive feedback related and unrelated to the target behaviors.” Exceptionality, 4, 117–123. doi: 10.1207/s15327035ex0402_5

Wolery, T. D., Schuster, J. W., & Collins, B. C. (2000). Effects on future learning of presenting non-target stimuli in antecedent and consequent conditions. Journal of Behavioral Education, 10, 77–94. doi: 10.1023/A:1016679928480

Received July 2, 2012
Final acceptance March 8, 2013
Action Editor, Jeffrey Tiger

Selection-Based Imitation: A Tool Skill in the Development of Receptive Language in Children With Autism

Jan 5, 2026

Stein K. Lund
Contingency Analysis/Perspectives Corporation

Abstract

Receptive language is a basic behavioral repertoire that many children with autism have difficulty acquiring. This difficulty may be caused by several factors, suggesting the need for case-by-case analysis and the development of multiple intervention strategies. This paper outlines a strategy that has been effective in establishing receptive labeling in some children for whom conventional methods proved ineffective. The present strategy emphasizes the development of tool skills that are conjectured to subserve receptive labeling. These tool skills are developed by teaching a form of imitation that may be termed “selection-based imitation” (SBI). The present strategy should be recognized as clinically based and should be subjected to more rigorous investigation and further refinement.

Keywords: receptive labeling; selection-based imitation; case-by-case analysis

 

Receptive language may be defined as a basic behavioral repertoire that consists of responding non-verbally to spoken words in accordance with conventions in a verbal community.  Impairment of receptive language or, more generally, language comprehension is prevalent in children with autism (Rutter & Schopler, 1987; Waterhouse & Fein, 1989).  A considerable proportion of these children have difficulty acquiring even the basic forms of receptive language, such as simple correspondence between words and objects (i.e., receptive object labeling).  This difficulty may be caused by several different factors, suggesting the need for case-by-case analysis and the development of multiple intervention strategies.  Thus, a continuous challenge for clinicians is to apply the concepts and principles of behavior analysis in creative, yet systematic and coherent ways.

This paper outlines a strategy that has proven effective clinically in teaching receptive labeling to some children where common methods were unsuccessful (for a description of common methods see, for instance, Leaf and McEachin, 1999; Lovaas, 1981).  The present strategy emphasizes the development of a form of imitation in which the instructor points to a stimulus (e.g., a picture) displayed in one array and the child imitates by pointing to the corresponding stimulus in a separate array.  In this interaction the response topography remains the same across trials, whereas the stimulus pointed to changes each time.  Since this form of imitation does not involve distinct response topographies, it may be termed “selection-based imitation” (SBI).  SBI involves several tool skills conjectured to subserve receptive labeling, including sustained attention to task, observation of other people’s interaction with stimuli (i.e., pointing), scanning (including shifting of attention between stimulus sets), and inhibition of “impulsive” (prepotent but incorrect) responding.  By establishing SBI, these skills are strengthened before any attempt to teach word-object relations.

Behavioral engineering and levels of analysis

When attempting to construct a basic behavioral repertoire such as receptive labeling, it may be useful to distinguish between two levels of analysis.  One level focuses on the target repertoire as an integrated routine in context, that is, a composite act defined in terms of its functional relation to environmental variables.  This may be termed the phenomenological level.  The other level is concerned with lower-level strata or sub-components of the repertoire in question (as defined at the phenomenological level).  In other words, this level is concerned with the internal structure of the repertoire, including the processes and operations sufficient to engender the relevant properties of the phenomenological level.  This may be termed the implementation level.  Phenomenological level and implementation level are not proposed as technical terms but as helpful, heuristic concepts with respect to behavioral engineering.

Behavior may not have a necessary structure other than trivial (Baer, 1982) and one often finds that a repertoire may be established through different program sequences, different constellations of components and different instructional strategies.  In other words, the implementation level of a behavioral repertoire may differ across individuals.  The focus of the implementation level is therefore to provide pragmatic or sufficient solutions rather than the uncovering of essential, immutable structures.   Consequently, the implementation level may include components that are “unnatural” or highly contrived.  However, construction of functional behavior does not require that each component, step or subroutine must be immediately functional or congruent with natural contingencies.  In “behavioral engineering,” any strategy or design that will accomplish the goal may be employed.

Receptive labeling deconstructed

Receptive language may be defined as responding non-verbally to spoken words in accordance with conventions in a verbal community.  Receptive labeling of objects may be described as: hear name X – select object Y.  This relation can be established when the speaker names an object while indicating its location (e.g., by looking or pointing) and provides reinforcement when the child orients toward it (e.g., hears the name “apple” – points to the apple).  Conversely, if the child orients toward a different object, orients toward the apple when another word is presented, or does not respond, reinforcement is not provided.  This idealized contingency is illustrated in Table 1.

Table 1.
Contingencies Involved in Receptive Labeling

Verbal stimulus (spoken word)   Response   Nonverbal stimulus (object)   Consequence
“Apple”                         Orients    Apple                         Reinforcement
“Apple”                         Orients    Orange                        No reinforcement
“Orange”                        Orients    Apple                         No reinforcement
“Apple”                         None       None                          Additional cues/assistance

 

Since a word-object relation is arbitrary (i.e., the word and the object share no common perceptual elements), each label must be learned individually.  However, through several interactions the child may acquire new relations more rapidly.  Bateson (1972) coined the term “deutero-learning” to refer to a progressive increase in learning within a domain.  Along with dimensions such as retention, proficiency, and generality, deutero-learning is an important consideration at the phenomenological level.

Although receptive labeling is a relatively simple behavioral process, it is not necessarily simple from the point of view of a naïve language learner.  During acquisition the learner is faced with several problems that may be referred to collectively as “the problem of induction” (Markman, 1989).  Markman described three related issues.  First,

When a child hears a word used to label an object, for example, an indefinite number of interpretations are possible for that word.  The child could think that the speaker is labeling the object as a whole, or one of its parts, or its substance, or its color, texture, size, shape, position in space, and on and on. (Markman, 1989, p. 8)

However, the typically developing child does not have a well-developed ability to analyze (i.e., dissect) objects into components or properties but is inclined to regard the whole object as a referent for the word.  This bias has been termed the whole object assumption (Markman, 1989).

Secondly, the child must extend a word to objects similar to those present during the initial interaction.  For instance, if the child has learned to identify a cat when hearing the word “cat,” he must extend the word to objects with perceived overall similarity, such as a different cat, a dog, or a raccoon, rather than to objects that tend to co-occur with cats, such as a bowl of milk or a litter box.  Typically developing children commonly generalize the word to objects with perceived overall similarity as opposed to objects that are related thematically.  This bias has been termed the taxonomic assumption (Markman, 1989).

Lastly, in order to constrain word-object relations and sustain stability of acquired relations, the child must initially refrain from attributing two different labels to the same object.  Thus, if a novel word is uttered in the presence of an object for which the child has already learned a name, he tends to reject the new word as an object label.  For instance, if the child has learned that an object is called “apple,” he is likely to attribute any new word (e.g., “red”) directed toward an apple to another feature of the object, such as its shape, some part, or its color.  This bias has been termed the mutual exclusivity assumption (Markman, 1989).

Applying the two levels of analysis to receptive labeling, the phenomenological level includes the act of identifying an object given a spoken word in accordance with the aforementioned contingency (see Table 1), whereas the implementation level includes the various learning constraints (whole object assumption, taxonomic assumption, and mutual exclusivity assumption) and lower-level strata, including scanning, responding to multiple cues, and auditory discrimination (Table 2).

Table 2.
Phenomenological and Implementation Levels of Receptive Labeling

Phenomenological level                         Implementation level
Responding conventionally to spoken words      Whole object assumption
(word-object relation: hear X, orient to Y)    Taxonomic assumption
                                               Mutual exclusivity assumption
                                               Responding to multiple cues
                                               Scanning
                                               Auditory discrimination

Children with autism and the implementation level

Apparently, many children with autism demonstrate significant deficits in pertinent tool skills (i.e., implementation level skills) and may therefore fail to acquire receptive labeling in a normative or even highly modified instructional context (e.g., discrete trial instruction).   Moreover, many children may not develop these tool skills under such conditions despite abundant practice opportunities.  Thus, when these skills are not present they must be targeted explicitly.

Children with autism often attend only to a restricted aspect of a stimulus rather than its overall features.  This “stimulus overselectivity” may account for many learning problems in this population (Lovaas et al., 1971) and is a crucial concern when teaching receptive labeling.  For instance, stimulus overselectivity may block the child from attributing the word to the whole object.  In such cases the child would not appreciate the abstract property of “sameness” as an overall feature of stimuli and hence would have difficulty extending the word taxonomically, as manifested in limited and/or idiosyncratic stimulus generalization.  Additionally, if the child has difficulty responding to multiple cues, or difficulty scanning and shifting attention between stimuli, the conventional contingency of receptive labeling may be ineffective and could even be counterproductive.

Stimulus overselectivity within the visual domain may be ameliorated through a systematic sequence of matching-to-sample (MTS) in which stimulus features change gradually.  Subtle changes from identical to non-identical stimuli, including cross-dimension matching (i.e., picture to object), may improve the child’s ability to focus on overall structure as the basis for perceptual categorization.  Thus, matching-to-sample may be a critical component in developing receptive labeling.  For instance, Barnard and Eisenhart (2001) presented a case study in which the reintroduction of a systematic matching-to-sample sequence after a period of receptive language training enhanced acquisition and retention of receptive labeling in a child with autism.

During acquisition, it is imperative that the child looks at the object to which the speaker points.   However, many children with autism do not attend to the speaker’s gestures or orientation.  When the instructor utters a word while pointing to an object, there is no guarantee that the gesture orients the child to the relevant object.   In an effort to ameliorate this deficit one may teach the child to respond reliably to visual directives and to shift attention flexibly from one stimulus to another.

Selection-Based Imitation

Selection-based imitation (SBI) encompasses several of the aforementioned tool skills as it integrates matching-to-sample and imitation.  Proficient SBI may therefore aid in the acquisition of receptive labeling for children who struggle to acquire this repertoire.

In SBI the instructor points to a stimulus displayed in one array and the child imitates by pointing to the corresponding stimulus in a separate array.  The child is neither required to discriminate between, nor perform, distinct response topographies (e.g., clapping, waving, and standing up).  In SBI the response topography remains the same in every trial, whereas the stimulus pointed to changes each time.  As opposed to imitation of behavioral topographies, SBI is a joint product of two stimuli: The stimulus to which the model points establishes the evocative effect of another stimulus (i.e., the corresponding stimulus in the separate array), which functions as a discriminative stimulus (SD) for the child’s response.

In SBI the instructor “nominates” the target stimulus by pointing to it.  Thus, the child must attend to the part of the environment with which another person interacts.  This differs from imitation of response topographies, where attention is directed to the other person’s movement per se.  Moreover, the child must shift attention flexibly from the instructor’s stimulus field to his own field.

A sequence of implementation

There may be several different sequences that can establish SBI (i.e., the implementation level may vary across children). The present sequence is based on considerable clinical experience with many children. It is organized into several incremental phases where the last phase consists of transferring stimulus control from a visual to an auditory stimulus (Table 3).

Table 3.
Sequence of Implementation

1. Linear configuration
2. Field expansion
3. Linear configuration/different positions
4. Non-linear configuration
5. Two steps
6. Transfer to receptive labeling

Linear configuration.  The instructor and the child each have an array of three pictures depicting three different objects.  The arrays are arranged so the pictures correspond both horizontally and vertically in sequential order, as illustrated in Figure 1.  The arrays may be arranged so the pictures are right side up for both the child and the instructor, or so both arrays are right side up relative to the child.  The instructor and the child sit directly across from each other at a table.

Figure 1.  Linear configurations.

The instructor delivers the instruction “do this,” points to one of the pictures in her array and manually guides the child to point to the corresponding picture in his array.   By using the generic instruction “do this” as opposed to the object name, one ensures that every trial is identical except for the selection of the picture, which changes each trial.  This arrangement may increase the likelihood that the child’s behavior comes under control of the relevant feature of the task.

It may be helpful to arrange for an additional instructor to administer the manual prompting from behind the child.  Prompting should be faded as soon as the child starts to respond to the target instruction.  The fields should be reconfigured frequently between trials.  Moreover, it is important to maintain short instructional intervals, a high pace of instruction and to afford the child a break based on peak performance.  For instance, when the child performs the first independent response, the instructional interval may be terminated.  Criteria should be increased systematically contingent on the child’s progress and eventually, the child should be able to perform four to six consecutive imitations before earning a break.

The objective in this phase is to ensure that the child attends reliably to the instructor’s response and shifts attention flexibly between the two stimulus arrays.

A common problem during this phase is that the child may point to the instructor’s array as opposed to his own.  Another problem is that the child may fail to attend to the instructor’s pointing.

Three strategies may be effective in shaping the child’s response to the appropriate array.  One strategy consists of increasing the distance between the two arrays in order to make the instructor’s field inaccessible to the child.  This proximity prompt may induce the child to “settle” for the closer of the two fields.  The distance between the two arrays can then be decreased gradually.  A second strategy consists of blocking access to the instructor’s array by gently guiding the child’s hand toward his own.  A third strategy consists of holding the target picture above the table while manually prompting the child to point to the corresponding picture in his array.  The picture should be brought closer to the table over successive trials and eventually placed among the other pictures.  The efficacy of this strategy derives from initially isolating the target stimulus and shaping discrimination in a nearly errorless fashion.

Field expansion.  In this phase the stimulus array should initially be increased to four pictures while maintaining a horizontal and vertical configuration as described in the first phase.   The field size (i.e., number of pictures in the array) should be increased gradually to eventually include at least six pictures as illustrated in Figure 2.  The procedure described in the previous phase may be employed.  The present phase should continue until the child imitates proficiently (i.e., four to six consecutive correct responses).

Figure 2.  Field expansion

The objective in this phase is to further strengthen scanning by gradually increasing the field size.  Because the response requirements are the same as in the previous phase and the field size is increased incrementally, there are typically no problems in this phase.  However, due to the increased scanning requirement, response latency may be slightly longer.  If the child struggles with larger fields, the instructor may decrease the field size and the instructional pace, then alter these dimensions systematically as the child becomes more proficient.

Linear configuration/different positions.   In this phase the fields should be arranged so the pictures no longer correspond with respect to their positions in the arrays as illustrated in Figure 3.   When introducing this phase, it may be necessary to scale back to a field of three pictures and gradually increase to a field of six (Figure 4).

Figure 3.  Linear configuration/different positions.

Figure 4.  Field expansion.

This phase is designed to solidify scanning (e.g., shifting of attention) and prevent development of “position pointing” in which the child responds to the position of the instructor’s finger as opposed to the target picture.

As indicated, a common problem in this phase is that the child may respond to the position of the instructor’s finger as opposed to the target picture.  Alternatively, the child may first point to the picture corresponding to the position of the instructor’s finger and then switch to the correct picture.  Clinical experience indicates that even a couple of reinforced trials could establish this pattern of “self-correction.”

To ameliorate these patterns, three strategies may be utilized.  One strategy consists of scaling back to two pictures and increasing the field size when the child performs proficiently.  Another strategy consists of blocking the child’s response to permit sufficient scanning time.  For instance, the child’s response may be prevented physically until he observes the instructor’s response and shifts attention (i.e., gaze) to his own array.   A third strategy consists of interrupting “position pointing” and initiating a new trial after a very brief delay (two to three seconds).

Non-linear configuration.  In this phase the two arrays should be arranged in an incrementally more randomized fashion as illustrated in Figure 5.  In the previous phases the child was required to scan horizontally only, whereas in this phase he must scan both horizontally and vertically.  The field configuration should be changed incrementally to minimize errors.

Figure 5. Non-linear configuration.

There are typically no problems in this phase.  However, the response latency may be slightly longer.  If response latency is considerably longer than in the previous phase, one could return to a linear configuration and build proficiency through more incremental randomization of the arrays.  Alternatively, this phase may be introduced by randomizing one of the arrays while maintaining linear configuration of the other.

Two steps.  When the child is proficient with single-step imitation, two-step imitation may be introduced.  Although this step may not always be necessary, it is designed to promote flexible scanning and attention to more durable and complex antecedent conditions.

It is critical to ensure that the child does not respond before the instructor completes both steps.  Initially, the instructor may insert a short delay between the responses.  For instance, the instructor points to one picture and delays the second response until the child performs his first response.  This delay must be faded as soon as possible.  The field size should increase gradually until the child can perform proficiently in a field of four to six pictures.

A common problem in this phase is that the child responds only to the first or the last picture.  Alternatively, the child may point to the pictures in reverse order (i.e., pointing to the last picture first and the first picture last).   As discussed, to prevent this pattern, a delay may be used in which the instructor points to one picture and delays the second response until the child performs his first response.   Another strategy consists of preventing the child from responding until the instructor completes both steps (e.g., holding the child’s hand).  A third possible strategy consists of holding the finger on the second picture until the child completes the chain.

Transfer to receptive labeling.  In this phase the verbal instruction should be changed from “do this” to the object label (e.g., “car”) while the instructor continues pointing to the picture.  However, in this phase pointing must be recognized as a prompt that must be faded successively as stimulus control transfers to the spoken word.  Prompting may be faded by gradually increasing the distance between the instructor’s finger and the picture (e.g., from touching the picture to a distance of 10 to 12 inches).

Initially, the instructor may select two labels to be taught in discrimination.  When the child responds correctly to the first label without prompting, the second label should be introduced.  The same fading strategy should be used for the second label.  However, when teaching the second label, the instructor may intersperse the trials by randomly presenting the first and the second label.  For example, the instructor names the second picture while pointing and intersperses the trials by occasionally naming the first label.  This procedure should be maintained until the child responds correctly to a random presentation of the two labels without prompting.  If the child responds inconsistently during random rotation, pointing may be inserted shortly after the verbal instruction to function as a delayed prompt (cf. Touchette & Howard, 1984).  Alternatively, the second label may be practiced in isolation, thus postponing random rotation until prompting is faded.  When the child has acquired three to four receptive labels, the instructor’s array may be removed and new labels may be taught through more conventional methods (e.g., Leaf & McEachin, 1999).
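The transfer logic above amounts to a progressive-delay schedule: the pointing prompt is withheld a little longer on each trial so that stimulus control can pass to the spoken word. The following Python sketch is purely illustrative and not part of the original procedure; every name and value (run_trial, run_block, the latency and delay figures) is a hypothetical abstraction of the trial bookkeeping, not a clinical protocol.

```python
# Hypothetical sketch of progressive delayed prompting (cf. Touchette &
# Howard, 1984). A "trial" is reduced to comparing the simulated child's
# response latency against the current prompt delay.

def run_trial(child_latency_s, prompt_delay_s):
    """Return 'independent' if the simulated response occurs before the
    prompt delay elapses, else 'prompted' (the instructor points first)."""
    return "independent" if child_latency_s < prompt_delay_s else "prompted"

def run_block(latencies, initial_delay_s=0.0, step_s=1.0):
    """Start by prompting immediately (errorless), then lengthen the delay
    after each trial so control can transfer from the pointing prompt to
    the spoken word."""
    results, delay = [], initial_delay_s
    for latency in latencies:
        results.append(run_trial(latency, delay))
        delay += step_s  # fade the prompt by waiting longer each trial
    return results
```

With a constant simulated latency, early trials are prompted and later trials become independent once the delay exceeds the latency, which is the intended transfer pattern.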

Discussion

SBI involves several tool skills that are subservient to receptive labeling, and it can therefore be used as a basis for developing this repertoire.  The sequence of implementation outlined in this paper is based on common behavioral principles such as shaping of antecedent stimulus conditions (i.e., stimulus topography shaping) and transfer of stimulus control through delayed prompting.  Admittedly, the present description includes vague and relative terms such as “proficient,” “as soon as possible,” “gradual,” “considerably longer,” and “sufficient scanning time.”  However, it is difficult to express the many nuances and dimensions in an unambiguous or absolute manner.  Despite this vagueness, the present outline may be of practical value to clinicians.

Although the strategy outlined in this paper has proven effective for several children, it is not always sufficient.  For many children, other strategies must be employed in order to ameliorate deficiencies in receptive labeling.  Nevertheless, the theoretical analysis offered in this paper illuminates some pertinent considerations when teaching receptive labeling, such as deficiencies of learning constraints, scanning (flexible shifting of attention between stimuli), and responsiveness to visual directives (i.e., pointing).  Thus, the present analysis may serve as a basis for systematic problem solving and development of alternative instructional strategies.

References

Abrahamsen, E. P., & Mitchell, J. R. (1990).  Communication and sensorimotor functioning in children with autism.  Journal of Autism and Developmental Disorders, 20, 75-86.

Baer, D. M. (1982).  The imposition of structure on behavior and the demolition of behavioral structure.  In D. J. Bernstein (Ed.), Response structure and organization: Nebraska Symposium on Motivation.  Lincoln, NE: University of Nebraska Press.

Barnard, J. C., & Eisenhart, D. (2001, May).  Building a foundation: An analysis of an early match-to-sample program sequence.  Paper presented at the 27th Annual Convention of the Association for Behavior Analysis, New Orleans, LA.

Bateson, G. (1972).  Steps to an ecology of mind.  New York, NY: Ballantine Books.

Bondy, A., & Frost, L. (2001).  A picture’s worth: PECS and other visual communication strategies in autism.  Bethesda, MD: Woodbine House.

Catania, A. C., Sveinsdottir, I., DeLeon, I. G., Christensen, A., & Hineline, P. N. (2002).  The paradoxical vocabularies of topography-based and selection-based verbal behavior.  European Journal of Behavior Analysis, 3, 81-85.

Leaf, R., & McEachin, J. (1999).  A work in progress: Behavior management strategies and a curriculum for intensive behavioral treatment of autism.  New York, NY: DRL Books.

Lovaas, O. I. (1981).  Teaching developmentally disabled children: The me book.  Austin, TX: Pro-Ed.

Lovaas, O. I. (2003).  Teaching individuals with developmental delays: Basic intervention techniques.  Austin, TX: Pro-Ed.

Lovaas, O. I., Schreibman, L., Koegel, R. L., & Rehm, R. (1971).  Selective responding by autistic children to multiple sensory input.  Journal of Abnormal Psychology, 77, 211-222.

Lund, S. K. (2002, May).  Establishing elementary rule-governed behavior in children with autism: A proposed sequence of implementation.  Paper presented at the 28th Annual Convention of the Association for Behavior Analysis, Toronto, Canada.

Lund, S. K., & Eisenhart, D. (2002, May).  Establishing icon-based communication in children with autism and severe cognitive delay.  Poster presented at the 28th Annual Convention of the Association for Behavior Analysis, Toronto, Canada.

Markman, E. M. (1989).  Categorization and naming in children: Problems of induction.  Cambridge, MA: The MIT Press.

McEachin, J. (2001, May).  What to teach and how much is enough?  Paper presented at the 27th Annual Convention of the Association for Behavior Analysis, New Orleans, LA.

Michael, J. (1985).  Two kinds of verbal behavior plus a possible third.  The Analysis of Verbal Behavior, 3, 1-4.

Pelios, L. V., & Sucharzewski, A. (2003, May).  A literature review on teaching strategies for establishing receptive language.  Paper presented at the 29th Annual Convention of the Association for Behavior Analysis, San Francisco, CA.

Powers, M. D., & Handleman, S. J. (1984).  Behavioral assessment of severe developmental disabilities.  Rockville, MD: Aspen Systems Corporation.

Rutter, M., & Schopler, E. (1987).  Autism and pervasive developmental disorders: Concepts and diagnostic issues.  Journal of Autism and Developmental Disorders, 17, 159-183.

Sigman, M., & Ungerer, J. (1984).  Cognitive and language skills in autistic, mentally retarded, and normal children.  Developmental Psychology, 20, 293-302.

The Barnhart concise dictionary of etymology: The origins of American English words. (1995).  New York, NY: HarperCollins Publishers.

Touchette, P. E., & Howard, J. S. (1984).  Errorless learning: Reinforcement contingencies and stimulus control transfer in delayed prompting.  Journal of Applied Behavior Analysis, 17, 175-188.

Waterhouse, L., & Fein, D. (1989).  Language skills in developmentally disabled children.  Brain and Language, 15, 307-333.

Author’s note

The principal content of this paper was presented at the 29th Annual Convention of the Association for Behavior Analysis, San Francisco, CA (May, 2003).  Correspondence concerning this article may be addressed to Stein K. Lund via electronic mail at Stein525lund@aol.com or by post at 1 Harbor Village Drive # 1, Middletown, RI 02842.

Footnotes

Footnote 1:  The term “selection-based” is adopted from Michael (1985).  It is not an extension of the technical term “selection” as used within technical discourse, but is derived from its colloquial usage (see Catania et al., 2002 for a discussion).

Footnote 2: Several clinicians have identified imitation, matching-to-sample, auditory discrimination, scanning and pointing as prerequisites to receptive language (Leaf & McEachin, 1999; Lovaas, 2003; Pelios & Sucharzewski, 2003).  Research from traditions outside the field of behavior analysis suggests that language comprehension is correlated with the ability to imitate familiar gestures (Abrahamsen & Mitchell, 1990; Sigman & Ungerer, 1984).

Footnote 3: This distinction is akin to the distinction between molar and molecular behavior (e.g., Powers & Handleman, 1984) or macro and micro behavior (e.g., McEachin, 2001).  Molar or macro behavior refers to broader, more global skills such as “eating,” whereas molecular or micro skills refer to the sub-skills that constitute the molar skill (e.g., holding a spoon, drinking from a glass, sitting appropriately).  The present distinction is somewhat different.  It pertains to integrated behavioral units such as “receptive labeling,” “imitation,” “tacting” and abstract concepts (e.g., nominal and genitive pronouns) that cannot be divided into a chain of independent responses the same way as macro-level skills (e.g., “eating”).  The term “phenomenological” as used here, is derived from the original meaning of “phenomenon” (fact, occurrence, manifestation) and should not be confused with the later meaning of “extraordinary occurrence” (The Barnhart Concise Dictionary of Etymology, 1995) or its specific meanings within philosophy (cf., Husserl, Heidegger and Sartre).

Footnote 4: Lund (2002) deconstructed behavior under control of function-altering contingency specifying stimuli and described a sequence in which elementary rule-governed behavior (i.e., “pliance”) can be established in children who fail to acquire this repertoire through natural environment teaching.  In a similar manner, Lund and Eisenhart (2002) deconstructed behavior involved in picture exchange and described how this repertoire can be established in children who do not acquire it in a normative socio-communicative context such as PECS (Bondy & Frost, 2001).  In both cases the repertoires were deconstructed, subcomponents (i.e., implementation level skills) were established in isolation and eventually combined into the target repertoire (i.e., phenomenological level).  Several steps were highly contrived.

Footnote 5: Terms such as “attribute,” “extend,” “referent,” and “regard” are vernacular terms with organism-based implications.  They are not viewed as theoretical primitives but are used for ease of communication.  A pragmatic reformulation can be provided when a situation demands it.

Designing Receptive Language Programs: Pushing the Boundaries of Research and Practice

Vincent LaMarca1 & Jennifer LaMarca2
Published online: 29 January 2018
© Association for Behavior Analysis International 2018

Abstract

Initial difficulty with receptive language is a stumbling block for some children with autism. Numerous strategies have been attempted over the years, and general guidelines for teaching receptive language have been published. But what to do when all else fails? This article reviews 21 strategies that have been effective for some children with autism. Although many of the strategies require further research, behavioral practitioners should consider implementation after careful review. The purpose of this article is to help behavior analysts in practice to categorize different teaching procedures for systematic review, recognize the conceptually systematic rationale behind each strategy, identify different client profiles that may make 1 strategy more effective than another, and create modifications to receptive language programming that remain grounded in research.

Keywords:  Autism, Developmental disabilities, Early intervention, Instructional strategies, Listener behavior, Receptive language

Vincent LaMarca
vincel@littlestarcenter.org

Jennifer LaMarca
jennl@appliedbehaviorcenter.org

1. LittleStar ABA Therapy, 12650 Hamilton Crossing Boulevard, Carmel, IN 46032, USA
2. Applied Behavior Center for Autism, 7901 E. 88th St., Indianapolis, IN 46256, USA

 

“If a child cannot learn in the way we teach, we must teach in a way the child can learn”—Dr. Lovaas’s famous call to action is as relevant to behavioral practitioners today as it was in the 1990s when he uttered those words at conferences in the United States and throughout the world. The research of Lovaas (1987); the follow-up study of McEachin, Smith, and Lovaas (1993); and the replication studies of Sallows and Graupner (2005) and Cohen, Amerine-Dickens, and Smith (2006) have laid the foundation of effective early intensive behavioral intervention (EIBI). One component of EIBI is the acquisition of receptive language, also referred to as listener responding. An article by Grow and LeBlanc (2013) provides a set of basic implementation guidelines to follow when first beginning receptive language programming. These strategies are meant to decrease the likelihood of encountering common difficulties associated with receptive language development in children with autism, such as faulty stimulus control, overselection, and failure to attend to the stimuli. Through a careful analysis of receptive labeling procedures and with evidence from research to support their recommendations, Grow and LeBlanc (2013) established a strong foundation upon which to develop receptive language programming.

However, several strategies have been effective in helping children with autism gain receptive language that either are not captured within the guidelines or are contrary to the guidelines. Table 1 lists those strategies and the general guideline they may violate. Two articles have already been written that include a list of programming variations to use when children struggle to acquire receptive language (Chesnut, Williamson, & Morrow, 2003; Pelios & Sucharzewski, 2004). Although lists of behavioral strategies for receptive language can be helpful in informing practitioners of the wide variations available for programming, practitioners must do their part not to fall into the trap of randomly choosing a strategy or attempting an uninformed shotgun approach of multiple strategies. First, careful consideration should be given to the evidence. Does a particular strategy have more or less evidence to back it up? How similar or different are the program variations under consideration from the exact procedures found in the research? Second, careful consideration must be given to the rationale. What is the hypothesis for why a particular strategy will work? What are the underlying behavioral principles that make it reasonable to assume that the strategy will be effective? Third, the individual child must be assessed carefully. How is this child similar to or different from the participants in the research? How or why does the rationale for this strategy fit this particular child’s profile?

Table 1 Guidelines from Grow and LeBlanc (2013)

Require an observing response (for visual stimuli; for auditory stimuli) and minimize inadvertent instructor cues (eye gaze and physical movements). Additional strategies: sound discrimination; receptive video labeling; receptive singing labeling; voice modulation; voice inflection.

Purposefully arrange the antecedent stimuli and required behaviors. Teach distinctly different behaviors (additional strategies: verb–noun combination; touch object versus hand object); introduce multiple targets simultaneously (additional strategies: simple-to-conditional discrimination; blocked trials; time expansion; two-item field); consider interspersing mastered targets (additional strategy: similar task interspersal with expansion trials).

Select appropriate and concise auditory instructions. Additional strategy: verb–noun combination.

Counterbalance visual and/or auditory stimuli; select features of the discriminative stimulus and incorrect comparison stimuli. Additional strategy: modes of stimuli.

Use effective prompting and differential reinforcement (identify an effective prompt and fading strategy; conduct systematic preference assessments; use differential reinforcement; troubleshoot existing problems with stimulus control). Additional strategies: embedded discrete trial teaching; modified incidental teaching.

Strategies that expand upon the guidelines are in boldface font. Those that violate the guidelines are in regular font. For example, the verb–noun combination strategy both expands upon one guideline and violates another.

This article is meant to build upon the existing literature and help behavior analysts become better problem solvers when difficulties with receptive language arise. The article identifies, through a literature review, possible alternatives to teaching receptive language when general guidelines fail. In addition, it identifies strategies that have not yet been studied experimentally but that hold promise based on their underlying rationale and their effectiveness with a few learners over the authors’ 22 years of practice. Finally, the article identifies specific strengths and weaknesses of a child that can help practitioners determine which alternatives may be most beneficial to attempt.

Potential Strategies from an Analysis of Current Skill Level

Behavioral practitioners often pride themselves on their ability to break down complex skills into smaller prerequisite skills, teach those prerequisite skills first, and then gradually combine those skills to teach more complex skills. When difficulty with a complex skill such as receptive labeling occurs, one approach available to behavioral practitioners is to focus on smaller prerequisite skills.

A receptive labeling program is a type of receptive language skill that requires a conditional discrimination rather than a simple discrimination. A simple discrimination is a basic three-term contingency composed of a discriminative stimulus, a response, and a differential consequence for the correct response. For example, in a receptive instructions program, a simple discrimination results when (a) the therapist provides an auditory stimulus (e.g., she says “Wave”), (b) the child responds to the stimulus (e.g., the child waves), and (c) the therapist delivers reinforcement only for the correct behavior. Conditional discriminations are a more complex four-term contingency that require an additional comparison to ensure a correct response. In an auditory–visual conditional discrimination, such as in a receptive labeling program, the auditory words spoken by the therapist make, for that moment, one visual item the discriminative stimulus (SD) and the other visual items S-deltas. For example, (a) the therapist provides an auditory stimulus (e.g., she says “Elmo”) while (b) an array of visual items (e.g., a car, Elmo, and a plastic cup) are in front of the child, so that (c) for the moment, the child selects Elmo from the array and not the car or plastic cup, and (d) the therapist delivers reinforcement only for the correct behavior. The therapist should also make sure that Elmo is established as both an SD (when the therapist says “Elmo”) and an S-delta (when the therapist labels a different object while Elmo is still in the array). If the therapist always asks for Elmo when Elmo is present, then the child does not have to attend to the auditory stimulus but rather only needs to visually discriminate where Elmo is located. Behavioral practitioners should not underestimate the complexity of a conditional discrimination or discriminations in general (Sidman, 1986, 2010)—conditional discriminations can be challenging to establish. 
Once a discrimination is established, we often assume that the stimuli we wanted to control the behavior are, in fact, controlling the behavior (Sidman, 2008). However, there are many variables in the environment that can inadvertently control the behavior, and we may overlook their impact on what we teach. Establishing a strong foundation of prerequisite skills in a child with autism becomes important so that we can focus specifically on the conditional discriminations we wish to develop.
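The four-term contingency described above can be reduced to a minimal sketch in which the spoken word nominates, for that trial, one array item as the SD and relegates the others to S-deltas. This is only an abstraction of the contingency for clarity, not a teaching procedure from the article, and the function name and labels are hypothetical.

```python
# Minimal abstraction of an auditory-visual conditional discrimination:
# the spoken word determines which array item is the momentary SD.

def score_trial(spoken_word, array, selection):
    """Deliver reinforcement only when the child selects the item that the
    spoken word makes the SD for this trial; every other array item is an
    S-delta for this trial."""
    if spoken_word not in array:
        raise ValueError("the named item must be present in the array")
    if selection == spoken_word:
        return "reinforce"
    return "no reinforcement"  # selection of an S-delta
```

A simple discrimination has no such conditionality: the reinforced selection is fixed across trials regardless of the auditory stimulus, which is why always asking for Elmo when Elmo is present can be solved by visual discrimination alone.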

The assessments of Kodak et al. (2015), which built upon the Assessment of Basic Learning Abilities (Kerr et al., 1977; Sakko, Martin, Vause, Martin, & Yu, 2004), serve as a useful starting point in identifying potential prerequisite skills for receptive labeling. The authors found correlations between the ability to complete all skills in the assessment and the ability to receptively identify objects. Five prerequisite skills were identified. First, in imitation of pointing, the therapist points at a picture in an array of two and the child points at the same picture. Second, in simple visual discrimination, the child touches a picture in an array of two pictures whose position is randomly rotated. One picture results in reinforcement and the other picture does not. Third, visual–visual identity matching is a type of visual–visual conditional discrimination that is often taught through match-to-sample procedures. For this skill, a therapist hands the child a picture and the child places the picture on top of a matching picture in a field of three cards. Fourth, by scanning, the child looks at each stimulus in the array during visual–visual identity matching. Finally, in simple auditory discrimination, the child touches a white card in the presence of a sound and keeps his hands in his lap in the presence of a different sound. Failure to demonstrate one of the five skills provides direction for prerequisite programming that may benefit a child prior to attempting a typical receptive labeling program.

A couple of combinations of the aforementioned skills may also be helpful prerequisites to the auditory–visual conditional discrimination required in receptive labeling. A simple auditory discrimination followed by a simple visual discrimination is more complex than either simple discrimination in isolation but is not as difficult as a conditional discrimination. An example of such a program would be having a therapist say the word go and then having a child always touch the same picture in an array of two pictures that are randomly rotated on the table. The discriminations remain simple because the spoken word is not directly related to the picture. The child must wait to respond until he hears the word go, but the word does not indicate which picture to touch. The child always touches the same picture. This is the type of discrimination present in the earlier example if a therapist always says “Elmo” when Elmo is on the table and never names another object on the table. Green (2001) and Grow and LeBlanc (2013) caution against such a procedure because it may inadvertently teach a child that he or she does not need to attend to the actual spoken word, or he or she may learn not to attend to all the stimuli in the array. However, as indicated in the following sections, procedures that include this type of discrimination have been helpful in teaching some children with autism to gain receptive language, perhaps in part because they need more practice with simple discriminations. If necessary, it may be possible to allay some concerns by using arbitrary nonfunctional sounds or arbitrary nonfunctional objects while a child gains this prerequisite skill and then using actual words and functional objects in a typical receptive labeling program.

Also, although an auditory–auditory conditional discrimination skill has been demonstrated to come after auditory–visual conditional discriminations (Marion et al., 2003), an auditory–visual conditional discrimination that includes auditory identity matching may facilitate correct responding. In fact, neuropsychology research has demonstrated that auditory sounds associated with an object can facilitate recognition of that object (Kassuba, Menz, Röder, & Siebner, 2013).

In such a program, the child’s response includes a sound that is the same as the sound in the SD. For example, a therapist reaches into a bag and pushes the button on a train that makes a train whistle noise. The child has three objects that make noise in front of him (the same train, an electronic piano, and a maraca). The child pushes the button on the train. Assessing these two skills in the example formats described previously may also help identify prerequisite skills to teach.

Finally, two other prerequisite skills worth assessing are a child’s ability to respond to shortened stimulus presentations and delayed matching-to-sample tasks. An auditory stimulus is transient, and children may be more successful with receptive labels after learning to respond to other stimuli that are present for only a short period of time. In addition, because a child must scan an array of objects before responding, the amount of time before a response can occur may be longer than in a simple discrimination. Research on delayed matching-to-sample tasks in both humans (Arntzen, 2006) and animals (Lind, Enquist, & Ghirlanda, 2015) may hold answers in helping children learn to remember the auditory sound while searching for the visual stimuli.

The following strategies may be helpful for children who demonstrate difficulty with one or more of the aforementioned prerequisite skills. Table 2 lists a synopsis of which of these strategies may be helpful based on an analysis of the prerequisite skills taught in each procedure and an initial assessment of a child’s ability to demonstrate those skills.

Strategy 1: Selection-Based Imitation

The general format of the program that Lund (2004) calls selection-based imitation starts with two identical sets of pictures placed directly across from each other on a table. The therapist says “Do this” and points to one of the pictures closest to her side of the table (e.g., a picture of a house). The child points to the same picture close to his side of the table (e.g., an identical house picture closest to the child). The program progresses from pictures lined up in a field of three to a line of six, followed by varying the picture location (so the pictures are not directly across from each other but are still in a line) and then finally varying the pictures in a random pile rather than in a straight line. We are unaware of any additional research articles that evaluate selection-based imitation, and the original article is a discussion of the procedure and its theoretical underpinnings based on its success with a few learners rather than an examination of the procedure using an experimental design.

The procedure teaches several prerequisite skills, including imitation of pointing and scanning. Children who benefited from the procedure also demonstrated “impulsive” responding, immediately grabbing for stimuli on the table, which means that they probably would have failed a simple visual discrimination test. Interestingly, the article notes that picture-to-picture matching skills are typically acquired prior to implementing selection-based imitation, so one would expect visual identity matching to be a strength for the child prior to implementing this program.

Strategy 2: Simple Auditory Discrimination

A basic receptive instructions program is one way to teach a simple auditory discrimination. The assessment of Kodak et al. (2015) provides another format that may be worth pursuing. The program would require the child to touch a card (e.g., a picture of a duck) in the presence of one auditory stimulus (e.g., the sound of a duck quacking) and not in the presence of other auditory stimuli (e.g., other sounds from a Listening Lotto game). One can extrapolate to other versions of this simple discrimination to include: (a) silence versus a target sound (e.g., a duck quacking), (b) auditory sounds versus a vocal target sound (e.g., du), (c) unbroken vocal sounds (mmmmm, hhhhhhh) versus a target word (e.g., duck), and (d) other words (e.g., elephant, juice) versus the target word (e.g., duck). In all of these formats, only one card would remain on the table to touch because this is meant to be a simple auditory discrimination, not a conditional discrimination.

Whether or not such a program would be beneficial as a prerequisite for receptive language is unknown, but it demonstrates the breadth of possibilities still worth studying, both in research and in practice, just in the area of simple auditory discrimination for a child who demonstrates difficulty acquiring receptive labels.

Strategy 3: Touch Same

A common program we have conducted in the past called “touch same” often follows other visual identity matching programs. The therapist holds up a picture (e.g., a frog) for a brief period (e.g., 1 s), and the child learns to touch an identical picture in a large field size (e.g., a field of 24 cards).

The program itself may be worth assessing in future research: it may help children gain the ability to respond to visual stimuli that are displayed for gradually shorter periods of time before they learn to respond to auditory stimuli, which are inherently brief. The format of the program also includes a delayed matching-to-sample component. As the field size increases, the amount of time it takes for the child to find the correct response also increases.

Strategy 4: Order of Stimulus Presentation

Whether the therapist delivers the auditory stimulus first (e.g., “dinosaur”) or presents the visual stimuli first (e.g., placing three objects in front of the child) may affect client learning. Petursdottir and Aguilar (2016) recently conducted research to determine which delivery method was more effective for four typically developing children. They found that delivering the auditory word first followed by showing pictures on a computer screen resulted in faster acquisition. In contrast, most applied settings present the visual stimuli first (e.g., putting objects on a table) followed by delivering the auditory SD. Although it is unknown whether results would be the same for children with autism, this is a component modification that could be manipulated and tracked by practitioners working with an individual child.

Such a program may be incorporated with a variety of the strategies that follow. The success of either format may be linked to the specific deficits a child exhibits. For example, if a child demonstrates difficulty scanning, presenting the objects first may also be used to require a type of observing response, during which the child is expected to shift his gaze to each object as it is placed on the table before the next item is presented and then the auditory SD is finally delivered. However, if the child demonstrates difficulty with simple auditory discriminations, the child could be required to engage in a differential observing response (e.g., touching a blank card) prior to the therapist repeating the SD and showing the visual stimuli (Green, 2001).

Strategy 5: Simple-to-Conditional Discrimination

The simple-to-conditional procedure is a nine-step process that ends with the conditional-only procedure. To get there, the therapist would (a) ask for “horse” with only the horse on the floor, (b) ask for “star” with only the star on the floor, (c) ask only for “horse” with both the horse and the star on the floor, (d) ask only for “star” with both the horse and the star on the floor, (e) randomly intermix “horse” and “star,” (f) ask for “Lightning McQueen” with only Lightning McQueen on the floor, (g) randomly intermix “Lightning McQueen” and “horse” with those two objects out, (h) randomly intermix “Lightning McQueen” and “star” with those two objects out, and finally (i) randomly intermix asking for the horse, star, and Lightning McQueen with all three objects on the floor (Lovaas, 2003).
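The nine-step progression can be summarized compactly as a schedule of (labels to intermix, objects present) pairs. This is a sketch for illustration only; the function name is ours, and the labels are the example stimuli from the text.

```python
# Sketch of the simple-to-conditional progression (Lovaas, 2003).
# Each step is a pair: (spoken labels randomly intermixed, objects on the floor).
def simple_to_conditional_steps(a, b, c):
    return [
        ([a],       [a]),        # (a) ask for a with only a out
        ([b],       [b]),        # (b) ask for b with only b out
        ([a],       [a, b]),     # (c) ask only for a with both objects out
        ([b],       [a, b]),     # (d) ask only for b with both objects out
        ([a, b],    [a, b]),     # (e) randomly intermix a and b
        ([c],       [c]),        # (f) ask for c with only c out
        ([c, a],    [c, a]),     # (g) intermix c and a with those two out
        ([c, b],    [c, b]),     # (h) intermix c and b with those two out
        ([a, b, c], [a, b, c]),  # (i) intermix all three with all three out
    ]

steps = simple_to_conditional_steps("horse", "star", "Lightning McQueen")
```

Laid out this way, it is easy to see which steps are simple auditory discriminations (a, b, f), which add a simple visual discrimination (c, d), and which are full conditional discriminations (e, g, h, i); Step (i) alone is the conditional-only procedure.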

Although multiple recent studies have indicated an advantage to using the conditional-only method that immediately starts at Step 9 (Grow, Carr, Kodak, Jostad, & Kisamore, 2011; Grow, Kodak, & Carr, 2014; Holmes, Eikeseth, & Schulze, 2015; Vedora & Grandelski, 2015), there are some limitations to the research. In particular, children were not initially assessed to determine whether or not they could already respond correctly to simple auditory discriminations and simple visual discriminations. In fact, many of the children immediately responded correctly to steps with simple auditory discriminations (e.g., Steps 1, 2, and 6) and responded within one or two sessions to steps with a simple auditory discrimination followed by a simple visual discrimination (e.g., Steps 3 and 4). However, for the few who demonstrated difficulty with the initial simple discrimination steps, mastery often occurred faster with, or nearly as quickly with, the simple-to-conditional procedure as with the conditional-only strategy.

Strategy 6: Blocked Trials

In blocked trials, the therapist delivers one SD (e.g., “Nemo”) in a field size of only two. The therapist repeats the same SD for a block of trials such as 10 trials. The therapist then switches to the other SD (e.g., “pizza”) for a second set of blocked trials. Based on meeting specific mastery criteria, blocks of trials are gradually decreased and SDs are randomly intermixed.

Blocked trials have a long history of success in teaching receptive labels to some individuals (Kodak et al., 2015; Pérez-González & Williams, 2002; Saunders & Spradlin, 1989). Pérez-González and Williams (2002) successfully used the procedure to teach receptive object labels to three children with autism who had already demonstrated difficulty acquiring the skill. Their procedure consisted of six steps:

  1. Blocks of 10 trials were carried out with objects remaining in the same location.
  2. Blocks of five trials were carried out with objects still remaining in the same location.
  3. Blocks of two or three trials were carried out with objects still remaining in the same location.
  4. The two object names were randomly intermixed with objects still remaining in the same location.
  5. The two object names were randomly intermixed with objects in the opposite location.
  6. The object names were randomly intermixed and the object location was randomly chosen.
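The early blocked phases and the final intermixed phases can be sketched as SD-sequence generators. This is an illustrative sketch of the general structure only (function names are ours); it omits the mastery criteria that govern when the therapist moves from one step to the next.

```python
import random

def blocked_sds(label_a, label_b, block_size, n_blocks):
    """One blocked phase: repeat one label for a whole block of trials,
    then switch to the other label (Steps 1-3 above, sketched)."""
    sds = []
    for i in range(n_blocks):
        label = label_a if i % 2 == 0 else label_b
        sds.extend([label] * block_size)
    return sds

def intermixed_sds(label_a, label_b, n_trials, rng):
    """A randomly intermixed phase (Steps 4-6 above, sketched)."""
    return [rng.choice([label_a, label_b]) for _ in range(n_trials)]

# Step 1: blocks of 10 with the same labels from the earlier example.
phase1 = blocked_sds("Nemo", "pizza", block_size=10, n_blocks=2)
# Later: random intermixing once the block size has been faded to one.
phase4 = intermixed_sds("Nemo", "pizza", n_trials=10, rng=random.Random(0))
```

Within each block the child faces only a simple discrimination; the conditional discrimination emerges gradually as the block size shrinks toward one.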

Interestingly, the researchers did not find the procedure problematic for the reasons one might typically associate with this strategy (i.e., faulty stimulus control based on the location of the object or matching by exclusion based on only two objects present in the field).

As with simple-to-conditional discrimination, repeating one label prior to changing to a different label in blocked trials sets up a simple discrimination that may be easier for the child to learn prior to learning a conditional discrimination. Research indicates that too frequent or too few reversals in a conditional discrimination can hinder acquisition (Saunders & Spradlin, 1989). Blocked trials alleviate this concern by systematically programming the reversals. In addition, blocked trials may be helpful for a child who demonstrates prompt dependency with physical and gestural prompts because the child has the opportunity to learn to correct his error by switching to the only other available object rather than through other forms of prompting.

Strategy 7: Sound Discrimination

The sound discrimination program progresses through a series of steps so that a child responds to the sound of an object by making the same sound. For example, the therapist shakes a rattle behind a barrier so that the child cannot see which object is making the sound. The child then selects the correct object (e.g., a rattle) from a field of three objects (e.g., bells, a rattle, and a drum) and shakes the rattle to make the same sound.

Eikeseth and Hayward (2009) successfully taught this skill to children who had not been able to acquire receptive labels.

Further, children demonstrated transfer in responding from the auditory sounds to the actual words. After teaching the child to discriminate between the sounds of two different objects, the vocal word was added as part of the SD and the sound was gradually faded for those two objects so that eventually the child picked up the rattle to shake it when hearing the word rattle and beat on the drum when hearing the word drum.

Responding to a variety of auditory stimuli may be easier to learn first before learning to respond to vocal stimuli (i.e., spoken words), which are a subset of auditory stimuli and have many more features in common than other auditory sounds (Eikeseth & Hayward, 2009). A whistle blowing, bells ringing, and a drum banging sound much different than the words whistle, bells, and drum. In addition, having the auditory sound occur in both the initial stimulus and the response may also create an easier auditory–visual conditional discrimination to acquire because it includes auditory matching. Because the program requires the child to make sounds with objects, the ability to manipulate objects, often already practiced in EIBI through some form of object imitation, should be a strength of the child.

Appendix 1 outlines a series of modifications to the sound discrimination program to help a child gradually switch from a broader auditory stimulus (e.g., the sound of a drum) that requires an auditory response (e.g., banging an identical drum) to a vocal stimulus (e.g., the word pizza) that leads to a non-vocal response (e.g., touching a toy pizza). For example, one variation changes the SD to a vocal sound that sounds similar to the object (e.g., making a high-pitched “ding-ding-ding” vs. a low-pitched “bum-bum-bum” sound for bells and a drum, respectively) as a potential next step in generalization from object sounds to vocal sounds. As another step, an app such as Speak All (Boesch, Wendt, Subramanian, & Hsu, 2013) can be used to record the therapist’s voice on one iPad (e.g., saying “rocket”) while the learner’s iPad can be programmed to repeat the therapist’s voice when the child touches the picture of the rocket. These and other variations of the sound discrimination program deserve further study to determine their usefulness as intermediate steps in the acquisition of receptive labeling.

Strategy 8: Receptive Video Labeling

In receptive video labeling, the therapist plays a short video clip from a movie or TV show (such as part of the Mickey Mouse Clubhouse theme song) and then pauses the video. The child hears the audio but does not see the video. The child then picks up the correct character associated with the movie from a field of three characters and is allowed to watch the remainder of the video clip.

The receptive video labeling program was conducted successfully with two children with autism by the authors of this article. The program includes elements of the stimulus specific reinforcement strategy discussed later. Future research into the effectiveness of this program and the necessary components to make it effective would be beneficial.

In general, the program is an auditory–visual conditional discrimination just like receptive labels, but the auditory SD is an excerpt from a video rather than a word. Children who already watch a variety of television shows or movies may benefit from this program because of either familiarity with or motivation for those videos. The program was first considered because the parents of one child who demonstrated difficulty acquiring receptive language noted that he always came running into the family room from anywhere in the house if the opening song from the Disney movie Cars was played on the television. In addition, a video excerpt lasts longer than a spoken word, potentially eliminating the difficulty posed by a brief, transient stimulus presentation.

Strategy 9: Receptive Singing Label

One version of the receptive singing label program involves the therapist singing an object label to a specific tune (e.g., “fire engine” to the two words in the tune “London Bridge” or “blanket, blanket” to the two words in the tune “twinkle, twinkle” as in “Twinkle, Twinkle, Little Star”). The child then hands the correct object to the therapist.

This singing program has been used successfully with three children with autism by the authors of this study. Simpson and Keen (2010) found that the use of song facilitated receptive labeling. However, there was little generalization when the music was not present. A follow-up study found that singing the SD in a receptive labeling task led to greater engagement and learning than in the spoken condition (Simpson, Keen, & Lamb, 2013). Computer-based software delivered the SD to the tune of “Old McDonald,” ending with one of five animal names. The child moved the computer mouse to the correct animal and clicked on it.

A vocal stimulus that includes additional auditory cues is key to the receptive singing label program. Sung words may be easier to discriminate than spoken words because of variations in rhythm, melody, and tone. Although this program violates the guidelines established by Grow and LeBlanc (2013) to stay away from voice modulation, it is important to recognize that the purpose of the receptive singing label program is again to establish an introductory level of auditory discrimination as a prerequisite skill. Its purpose is not, in fact, to teach the receptive labeling of objects using only words. Interestingly, however, two of the children with whom the authors worked were able to successfully transfer the skill and respond to the spoken label alone when the song was faded. Children for whom this program may be especially beneficial are those for whom music is a strong reinforcer (e.g., musical sounds, musical instruments, or songs in general).

Strategy 10: Voice Inflection

Emphasizing different parts of the actual label (e.g., “Darth Vader” in a low, slow voice vs. “Puppy!” in a high-pitched voice) is the essence of a voice inflection program. In a recent study, Simpson, Keen, and Lamb (2015) found that there was little difference between sung words and spoken words when an elevated pitch was used in each. They noted that some research indicates that children with autism respond better to linguistic and musical pitch, which may be the relevant feature in the intervention.

Using voice modulation is not recommended by Grow and LeBlanc (2013) because of the possibility that the child will overselect the way in which the word is said rather than the word itself. However, for a child who is simply learning to focus on the auditory sound, using voice inflection may be appropriate. Acquiring a few labels with voice inflection may be a gradual step toward more subtle vocal discriminations. In this case, it would be better to consider the voice inflection auditory sound as the actual target rather than as a prompt to be faded. The goal is to obtain a basic discrimination between a high pitch and a low pitch; if they cannot be faded, then include another object as a high pitch said slowly and another object as a low pitch said slowly, and continue to include different pitch and duration variations. To mitigate the concern of having to spend an exorbitant amount of time in the future programming appropriate stimulus control, behavior practitioners should consider using arbitrary objects or only a small subset of items.

Strategy 11: Response Delay

Dyer, Christian, and Luce (1982) created a program in which the therapist labels an object (e.g., “Spiderman”) with three objects in front of the child. The therapist waits 3 s and then signals for the child to respond (e.g., holding down the child’s hands to prevent him or her from responding sooner). The child must wait until his or her hands are released before pointing to the correct object.

Lamela and Tincani (2012) extended the research on wait times by comparing a brief wait time (approximately 1 s) with a longer wait time (approximately 4 s) in two children with autism who demonstrated off-task behavior during one-on-one therapy. Their results indicated that the brief wait time led to more correct responding, which is comparable to the findings of one study (Tincani & Crozier, 2008) and contrary to the findings of other studies, two of which focused on receptive language development (Dyer et al., 1982; Valcante, Roberson, Reid, & Wolking, 1989). It appears that the appropriate wait time is a balance between allowing enough time for a child to stop engaging in other off-task behaviors and attend to the relevant cues and being short enough to keep the child attending to the task without engaging in other inappropriate behaviors.

The response delay program may enhance one skill identified in the study by Kodak et al. (2015): attending to the task by scanning. Such a program may be incorporated with many of the strategies we have already discussed. However, it may also be appropriate for children who have met all other prerequisite skills for auditory–visual conditional discriminations but who still have a tendency to engage in off-task behavior during therapy, look away from the materials in front of them, or engage in impulsive behavior and immediately grab for the objects in front of them even before the SD is delivered. The program can also be modified to focus on delayed matching to sample by not allowing the child to see the objects in the array until after a predetermined length of time after delivering the auditory SD.

Potential Strategies from an Analysis of Program Implementation

If a child demonstrates all of the prerequisite skills for auditory–visual conditional discriminations but still demonstrates difficulty with receptive labeling programs, another source of information to help determine which variations of conditional discrimination programs may be helpful is observations of the child’s performance in other programs. The relative ease that accompanies learning other skills when using specific treatment techniques may transfer to other programs. Table 3 summarizes the types of behaviors that may have already been observed in other early intervention programming. The strategies discussed in the following sections may be effective for an individual child based on his or her demonstrated weaknesses and strengths.

Strategy 12: Similar Task Interspersal with Expansion Trials

Three forms of task interspersal have been clearly outlined by Volkert, Lerman, Trosclair, Addison, and Kodak (2008), including similar task interspersal. Expansion trials include the systematic increase of time or demands between when a current target SD is delivered and the next time it is delivered. In similar task interspersal with expansion trials, one SD is the target (e.g., “Touch the boat”) and other acquired SDs from the same program (e.g., “Touch the cake,” “Touch Thomas the Tank Engine”) are gradually included. Thus, the SD sequence of “Touch the boat,” “Touch the cake,” and then “Touch the boat” would be considered an expansion of one because one acquired SD was interspersed between the target SD. The SD sequence of “boat–cake–Thomas–Thomas–boat” would be considered an expansion of three because three acquired SDs were interspersed between the target SD (“boat”).
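The expansion metric described above can be stated precisely as a count of acquired SDs between two presentations of the target. A minimal sketch (the function name is ours), using the exact sequences from the text:

```python
def expansion_count(sd_sequence, target):
    """Number of acquired SDs interspersed between the first two
    presentations of the target SD in a trial sequence."""
    first = sd_sequence.index(target)
    second = sd_sequence.index(target, first + 1)
    return second - first - 1

# "Touch the boat" -> "Touch the cake" -> "Touch the boat": expansion of one
print(expansion_count(["boat", "cake", "boat"], "boat"))  # 1
# boat-cake-Thomas-Thomas-boat: expansion of three
print(expansion_count(["boat", "cake", "Thomas", "Thomas", "boat"], "boat"))  # 3
```

In an expansion-trials program, this count is what the therapist systematically increases as the child continues to respond correctly to the target.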

Table 3  Beneficial strategies based on strengths and weaknesses a child demonstrates in earlier early intensive behavioral intervention programming

Strategy                                          Strength or weakness
Similar task interspersal with expansion trials   W = frequent maintenance trials required
Time expansion                                    W = a large number of trials necessary to learn a new skill
Touch object versus hand object                   W = double responding/scrolling
Embedded discrete trial teaching                  W = limited contrived reinforcers
Verb–noun combination                             W = need for radically different responses in other programs; S = object imitation; receptive instructions
Modified incidental teaching                      S = requesting through incidental teaching
Two-item field                                    S = effective error correction procedure
Modes of stimuli                                  S = faster acquisition with one type of stimulus

W = weakness, S = strength

Grow and LeBlanc (2013) recommend that behavior analysts use task interspersal in the form of either similar task interspersal or dissimilar task interspersal when teaching receptive labels. Smith (1994) demonstrated that children were more likely to retain skills with such a systematic approach to task interspersal compared to a mass-trial condition. Further research should compare a systematic expansion trial approach with a more random interspersal of trials, a systematic expansion approach with similar responses versus dissimilar responses, and the number of expansion trials necessary for most children to maintain a skill from one day to the next.

Because the procedure gradually and systematically increases the amount of time or work between when newer skills are practiced, it may be particularly helpful for a child who has difficulty maintaining skills once they are acquired.

Strategy 13: Time Expansion

In this program, the therapist delivers the SD (e.g., “tambourine”) with a tambourine located in one corner of the room. The therapist prompts the first response (e.g., walking to the tambourine and shaking it), waits for 5 s, and redelivers the SD. If the child is successful, the therapist continues to systematically increase the time before redelivering the SD (e.g., 10 s, 30 s, 1 min, 2 min, 5 min, 10 min, 20 min, 30 min, 1 h, 2 h, 3 h, 6 h, overnight). If the child responds incorrectly, the therapist decreases the time to the previous level. Between the 5-s and 5-min time period, the therapist engages the child in other preferred activities. From the 5-min time period forward, the therapist continues with other programs. Once one target has been mastered, a second target is introduced the next day. The goal is for the learner to acquire a label in 1 day and recall it the next morning. Once a child has acquired two labels, each in a day, those two labels are randomly intermixed.
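The delay progression above can be sketched as a simple titration rule: step up one level after a correct response, step back one level after an error. This is an illustrative sketch of the schedule as described (the names are ours); real implementations would add mastery criteria and data recording.

```python
# Delay levels from the text, in seconds (followed in practice by overnight).
DELAYS_S = [5, 10, 30, 60, 2*60, 5*60, 10*60, 20*60, 30*60,
            1*3600, 2*3600, 3*3600, 6*3600]

def next_level(current_index, correct):
    """Advance one delay level on a correct response; fall back one
    level on an incorrect response (floor at the first level)."""
    if correct:
        return min(current_index + 1, len(DELAYS_S) - 1)
    return max(current_index - 1, 0)

level = 0                                # start at 5 s
level = next_level(level, correct=True)  # correct -> 10 s
level = next_level(level, correct=False) # error -> back to 5 s
```

The widening gaps make explicit how the procedure trades trial frequency for retention interval, which is the balance described in the next paragraph.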

This teaching procedure, another area for future research, was used successfully with one child by the authors of this article but was unsuccessful with another child. The procedure is similar to distributed trial instructions, in which breaks of 5 s to a few minutes occur between trials (Majdalany, Wilder, Greif, Mathisen, & Saini, 2014).

Time expansion balances the frequency with which a skill is practiced with the amount of time that passes between trials. It also incorporates a dissimilar task interspersal procedure so that the skill is interspersed with all other skills that are practiced throughout the day. The procedure may be helpful for children who need a large number of trials to learn the skill and therefore may benefit from a more systematic increase in the frequency of practice.

Strategy 14: Touch Object Versus Hand Object

Behavior practitioners should consider the topography of the child’s response in receptive labeling. One format is to have the child touch an object. A second format is to have the child hand an object to the therapist. A third format is to require the child to stand up, walk to the object, and then either touch the object or bring it back to the therapist.

Booth (1978) noted in his research on receptive object identification that children with disabilities responded best when they were required to hand objects to the therapist. The response associated with picking up an object and handing it to the therapist (or placing it in a container) makes it more difficult to give multiple responses (e.g., pointing to one object and then immediately pointing to another). Other behaviors a child exhibits (e.g., the likelihood the child will throw an object or elope) may also make one response format more effective than another.

Strategy 15: Embedded Discrete Trial Teaching

Embedded discrete trial teaching incorporates motivation and natural reinforcers within the teaching format. For example, one child may jump to the correct picture based on his preference for a jumping game. Other response format variations might include using a pointer to point to the correct response, shining a flashlight on the correct response, dropping the correct response in water, or slapping the correct picture with a flyswatter. A list of 25 different response formats, originally posted to the Me-List listserv in 1997, is included in Appendix 2.

Geiger et al. (2012) demonstrated that the strategy was more efficient than traditional discrete trials in teaching receptive labeling. A child’s particular preferences become important in the selection of a response format, and a variety of strategies are available to help determine the child’s preference for particular activities (Reid, DiCarlo, Schepis, Hawkins, & Stricklin, 2003).

Such a strategy attempts to increase the motivation associated with the response to help maintain the child’s attention. It is also helpful for children who do not respond to other typical forms of contrived reinforcement. At the same time, it is important to ensure that the response does not add too much undue complexity to the child’s basic discrimination task.

Strategy 16: Verb–Noun Combination

A verb–noun combination requires a response that includes both a discrete action and an object (e.g., “Push car,” “Wave flag,” “Blow bubbles”). The same action is always conducted with the same object.

Curiel, Sainato, and Goldstein (2016) implemented a matrix training procedure with one child in which five actions, each with a different object, were first taught, and then therapists probed for generalization to other combinations of the same actions and objects (e.g., “Push flag” and “Wave car”). Although the child in the study also had limited receptive language skills and did not respond to receptive instructions, the focus of the verb–noun program is not to probe for generalization, which would involve a more complex conditional discrimination. Instead, the focus is to teach initial auditory–visual discriminations with objects and actions that are as radically different from each other as possible, rather than always requiring the same action (e.g., touching objects). In our experience, other indications that this format should be attempted are that a child has been successful in object imitation programs and has already acquired a variety of simple receptive instructions.

Strategy 17: Modified Incidental Teaching

In this program, the therapist brings the child to an area associated with a preferred activity (e.g., into the kitchen with items on the counter) and asks if he is ready to make a snack. When the child indicates that he is ready for a snack (e.g., pointing, nodding his head, using augmentative communication, or saying “Yes”), the therapist asks for the items needed to make the snack in random order. Once all items are successfully identified receptively (with prompts if necessary), the child is allowed to make the snack.

McGee, Krantz, Mason, and McClannahan (1983) created this strategy by combining elements of both incidental teaching and discrete trial teaching. Increased motivation in the program may increase a child’s attention to the objects as well as ensure that a high level of reinforcement is delivered. This strategy may be beneficial for learners who have shown rapid development in other skill areas when an incidental teaching approach was used (e.g., in requesting, imitation, or play).

Strategy 18: Two-Item Field

A two-item field in which only two objects are placed in the array is a common format in blocked trials and can teach a basic problem-solving strategy of trying something different (i.e., changing answers) if the first response is incorrect.

The preferred field size suggested by both Grow and LeBlanc (2013) and Green (2001) is a three-item field because it decreases the likelihood that the child will respond correctly by chance. However, Leaf, Sheldon, and Sherman (2010) used a no-no prompt strategy with a two-item field to successfully teach receptive labels to three children with autism. During the program, the therapist delivers an SD (e.g., “garbage truck”) with two objects on the table (a garbage truck and a Lego). If the child responds incorrectly, the therapist says “No” in a neutral tone and repeats the SD. If the child responds incorrectly again, the therapist says “No” again and delivers the SD a third time while also delivering the least intrusive prompt that is effective for the child.

As discussed in blocked trials, the format allows other forms of prompting to be faded and sets up a situation in which learning occurs through differential reinforcement of all responses. The format can initially be attempted in an easier program such as visual identity matching. Responding to such differential reinforcement is key to the strategy. One must be cautious with children who do not contact enough differential reinforcement for immediately responding correctly and who instead just randomly choose an object and, if the response is incorrect, choose the other object.

Strategy 19: Modes of Stimuli

Four common modes of stimuli include objects (e.g., a My Little Pony figurine), pictures (e.g., a picture of a tiger), other people (e.g., the child points to the therapist’s nose), and the child himself (e.g., the child points to his own feet). Different learners may attend better or be more motivated by different modes of stimuli.

There is no current research comparing the acquisition rate of receptive language across different modes of stimuli. However, in a study of tacts, Pérez-González, Cereijo-Blanco, and Carnerero (2014) found that children showed more emergence of novel skills with objects than with pictures, demonstrating that the mode of stimuli can matter in the development of some skills.

Practitioners may be able to determine which stimuli are likely to be more effective by evaluating the child’s acquisition rate on other tasks that use different modes of stimuli (e.g., matching pictures vs. matching objects) or by allowing the child to choose between program formats. Whenever stimuli are used that may be motivating, one must also be cautious that the items are not so motivating that the child is always grabbing for the items just to gain access to them.

Potential Strategies from an Analysis of Equivalence Class Formations

The use of equivalence classes to, in a sense, work around a child’s difficulty with receptive language is one final strategy to consider. A large body of research exists concerning equivalence class formation in individuals with autism (McLay, Sutherland, Church, & Tyler-Merrick, 2013). Results are mixed, with equivalence classes emerging for some individuals but not for others. The following strategies may be worth attempting with children who already demonstrate equivalence class formation with visual identity matching tasks.

Strategy 20: Audio-Specific Consequences

With audio-specific consequences, a child is initially handed an ambulance to match to a picture of the ambulance in an array of three or four pictures. When the child matches the object to the picture, the child is then given an edible as reinforcement and an ambulance sound is played at the same time. The child learns to match four objects to pictures in this format (e.g., after matching a stuffed lion to a picture of a lion, the sound of a lion is played during reinforcement; after matching a spaceship to a picture of a spaceship, the sound of a spaceship blasting off is played during reinforcement). In Phase 2 of the program, the sound is delivered as the stimulus (e.g., the sound of the ambulance) and the child selects the correct picture (e.g., in an array of the ambulance, lion, rooster, and spaceship), hopefully without the need for additional teaching.

Varella and de Souza (2014) demonstrated the emergence of auditory–visual relations when a specific sound was presented as part of the consequence for each stimulus. Although Varella and de Souza’s results were promising, the four 7- to 15-year-old children with autism in the research already had some receptive language skills, although only in the range of those of a 3-year-old.

The procedure and its rationale have long been studied in both animals and humans (Dube, McIlvane, Mackay, & Stoddard, 1987; Zaine, Domeniconi, & de Rose, 2014). If a specific reinforcer is used for each comparison stimulus in a conditional discrimination procedure, the reinforcer may become part of the equivalence class, and equivalence relations that include the reinforcer may emerge without deliberate teaching.

Strategy 21: Stimulus-Specific Reinforcement

Rather than deliver the same or random reinforcers for correct responding, stimulus-specific reinforcement always delivers one specific reinforcer for each specific response (e.g., a cookie is given for correct responding to “spoon” and M&Ms are given for correct responding to “Buzz Lightyear”).

Litt and Schreibman (1981) initially published a study demonstrating the value of stimulus-specific reinforcement in learning receptive labeling discriminations. Chong and Carr (2010), however, did not replicate the results. Notably, the children in the study conducted by Litt and Schreibman (1981) were all nonvocal, whereas the children in the study conducted by Chong and Carr (2010) were categorized as advanced vocal learners.

Stimulus-specific reinforcement is potentially the most puzzling strategy in this article. There are multiple theories behind how the procedure works (Goeters, Blakely, & Poling, 1992; Urcuioli, 2005). Goeters et al. (1992) boldly stated that it does not matter if we know why it works—the fact that it works is reason enough to use it. However, Chong and Carr (2010) noted that the procedure is consistently successful with animals but has mixed results in human populations.

Conclusion

The 21 strategies included in this article are not meant to be an exhaustive list of alternative strategies for teaching receptive language. For example, many of the strategies can be used together to create additional alternatives. By the time all options have been exhausted, there are literally hundreds of different potential combinations. Also, other alternatives have been suggested, such as teaching expressive language in the form of tacts or mands first (Pelios & Sucharzewski, 2004) or focusing on joint attention skills (Yoder, Watson, & Lambert, 2015). While research continues to assess each strategy, this list is meant to serve as an additional resource upon which behavior analysts can continue to build based on current research and practice, conceptually systematic rationale, and individual child profiles.

The strategies are also not meant to replace the general guidelines of Grow and LeBlanc (2013). Many of the suggested strategies require additional research. Many of the strategies violate at least one of the general guidelines. But the fact remains that some children with autism continue to demonstrate difficulty with receptive labeling when general guidelines are followed, and some children with autism do make progress with the aforementioned strategies. A cursory review of the research studies in this article that included data on the number of sessions to mastery indicated that children typically acquired an initial discrimination in receptive labeling for the first two to three items within 9 to 14 sessions. That is around 2 weeks in most EIBI programs. If something is not working, what assessments are behavioral practitioners conducting, and what changes are occurring? A child who cannot learn in the way we teach is depending on us to find a way to help him or her learn.

In our rush to find what will work, we must remain careful not to just find what is different. At a minimum, when a child is demonstrating difficulties gaining receptive language, behavior analysts should critically review the progress of a learner and make ongoing changes to standard programming based on data. Insights from applied behavior analysis will grow most rapidly and accurately when there is a symbiotic relationship between behavior analysts in research and behavior analysts in practice. In research, rigorous, narrowly controlled experiments test the validity of what we do. Research keeps us grounded in evidence-based practice. But it is impossible to study all of the decisions behavior analysts make on a daily, weekly, and monthly basis. In fact, some of those decisions become the spark for future research. When working with children with autism, insights from research without insights from practice become lethargic. Insights from practice without insights from research become impulsive. Insights from research and practice together become transformative. Behavioral practitioners hold themselves accountable to that ideal. Behavioral practitioners never settle.

Fig. 1 Progression from sound discrimination to typical receptive labeling. AUD = auditory sound; VOCS = vocal sound; VOCW = vocal word

 

Appendix 1

Sound Discrimination Variations

As a final product, receptive labeling begins with a vocal stimulus (e.g., the word pizza) that results in a non-vocal response (e.g., touching a toy pizza). This is indicated at the bottom of Fig. 1 by the program with a thick black border. The sound discrimination program begins with a broader auditory stimulus (e.g., the sound of a drum) that requires the child to produce the same auditory response (e.g., banging an identical drum). This is indicated near the top of Fig. 1 by the program with a thick dotted border. Overall, the figure illustrates through a basic flowchart the ways in which variations to the sound discrimination program can be made, gradually teaching skills through changes to the discriminative stimulus (SD) or the response that ultimately arrive at receptive labeling. Sounds can progress from auditory sounds (e.g., a horn, a bell, a train whistle) to vocal sounds (e.g., consonant sounds, vowel sounds, or even a raspberry noise) to vocal words. These combinations are indicated by thicker arrows that curve vertically in the figure. Combinations can also change from auditory stimuli and auditory responses that are identical (column 1) to other formats with identical auditory stimuli and auditory responses (column 2) to auditory stimuli and auditory responses that are not identical (column 3) to auditory stimuli with responses that include no auditory component (column 4). These combinations are indicated by thinner arrows that point in a horizontal direction. All variations should be studied further to determine if learning these skills will facilitate the development of receptive labeling. Programming should attempt to progress as quickly as possible to a standard receptive labeling program but could focus on related skills when a child demonstrates difficulty with changes to the SD or response.

Variation 1: Auditory SD—Auditory Response

SD: The therapist shakes a bell from behind a barrier with three objects in front of the child (the same bell, a rice shaker, and a drum).

Response: The child picks up the bell and shakes it.

Variation 2: Two Auditory SDs—Two Auditory Responses

SD: The therapist shakes a rice shaker and then pounds a drum from behind a barrier with three objects in front of the child (a bell, a rice shaker, and a drum).

Response: The child picks up the rice shaker and shakes it, then picks up the drum and pounds it.

Variation 3: New Auditory SD—Test Auditory Response

SD: The therapist plays a novel instrument from behind a barrier (e.g., a squeeze horn) with three novel objects that make sounds in front of the child (the same horn, a xylophone, and a piano).

Response: The child plays each of the instruments and places the correct instrument on a plate or in a container to indicate that that is the one the child has chosen as making the same sound.

Variation 4: Nonidentical Auditory SD—Auditory Response

SD: The therapist shakes a rice shaker that sounds similar to but is not the same rice shaker the child has. For example, the therapist could shake a larger rice shaker and play rice shaker sounds from an iPad that sound similar. Three objects are in front of the child (a bell, a rice shaker, and a drum).

Response: The child picks up the rice shaker and shakes it.

Variation 5: Vocal Sound SD—Auditory Response

SD: The therapist makes a vocal sound such as “shhhhhhh.” Three objects are in front of the child (a bell, a rice shaker, and a drum).

Response: The child picks up the rice shaker and shakes it. The therapist initially shakes the rice shaker while making the vocal sound and then gradually fades the rice shaker sound.

Variation 6: Vocal Word SD—Auditory Response

SD: The therapist says “rice shaker.” Three objects are in front of the child (a bell, a rice shaker, and a drum).

Response: The child picks up the rice shaker and shakes it. The therapist initially shakes the rice shaker while saying the word or initially includes the vocal sound (e.g., “shhhhhhhaker”), depending on which formats were previously mastered.

Variation 7: Auditory SD—No Auditory Response

SD: The therapist shakes a rice shaker from behind a barrier with three objects in front of the child (a bell, a rice shaker, and a drum).

Response: The child points to the correct object rather than making the sound, or the inside of the rice shaker is removed so that when the child picks it up and hands it to the therapist, it does not make a sound.

Variation 8: Vocal Sound SD—Vocal Sound Response

SD: The therapist makes a vocal sound such as “mmmmmm” or “shhhhhhh” or a blowing sound, or the therapist presses an icon on an iPad that has a prerecorded vocal sound associated with it.

Response: The child touches his own iPad with the picture that makes the same sound (e.g., “mmmmm” for a food the child really likes; “shhhhh” for a rocketship blasting off into space; a blowing sound for a canister of bubbles).

Through apps such as SpeakAll, the iPad can be used to record the therapist’s voice and the child’s iPad can be programmed to repeat the therapist’s voice when the child touches a picture.

Variation 9: Two Vocal Sound SDs—Two Vocal Sound Responses

SD: The therapist makes the vocal sound “mmmmmm,” pauses for 1 s, and then makes a blowing sound.

Response: The child touches his own iPad with the two pictures in order of the same sounds (e.g., “mmmmm” for a food the child really likes and then a blowing sound for a canister of bubbles).

Variation 10: New Vocal Sound SD—Test Vocal Sound Response

SD: The therapist plays a novel vocal sound on his iPad.

Response: The child presses new pictures on his iPad until he finds the correct sound and then stops pressing buttons.

Variation 11: Nonidentical Vocal Sound SD—Vocal Sound Response

SD: Different therapists make the same vocal sound such as “mmmmmm” or “shhhhhhh” but vary the pitch, volume, or inflection.

Response: The child touches his own iPad with the picture that makes the original sound.

Variation 12: Vocal Word SD—Vocal Sound Response

SD: The therapist says “rocket.”

Response: The child touches his own iPad with the picture that makes the vocal sound (e.g., pressing the rocket results in the sound “rrrrrrrrrr”).

This program may work best when the vocal sound that was originally used is part of the vocal word. Vocal sounds can either sound like the auditory sound of an object (e.g., “shhhhhhh” may sound similar to the sound a rocket makes) or like part of the word (e.g., “rrrrr” is the initial sound of “rocket”). Which type of vocal sound is used will partly depend on the expected progression through the sound discrimination program (e.g., Is the focus on transferring from auditory sounds to vocal sounds or from vocal sounds to vocal words?) and may also include a transition from one vocal sound to the other.

Variation 13: Vocal Sound SD—No Vocal Sound Response

SD: The therapist makes a vocal sound such as “mmmmmm” or “shhhhhhh” or a blowing sound, or the therapist presses an icon on an iPad that has a prerecorded vocal sound associated with it.

Response: The child touches an object, a picture on the table, or a picture on his own iPad. No sound is made during the response.

Variation 14: Vocal Word SD—Vocal Word Response

SD: The therapist names an object (e.g., Woody), or the therapist presses an icon on an iPad that has a prerecorded vocal word associated with it.

Response: The child touches his own iPad with the same picture that says the same word (e.g., the icon of Woody says “Woody” when touched).

Variation 15: Two Vocal Word SDs—Two Vocal Word Responses

SD: The therapist says “Woody” and then says “train.”

Response: The child touches the two pictures on his own iPad in the same order (e.g., “Woody” results in the word Woody repeated by his iPad and then “train” results in the word train repeated on his iPad).

Variation 16: New Vocal Word SD—Test Vocal Word Response

SD: The therapist plays a novel word on his iPad (e.g., Buzz Lightyear).

Response: The child presses new pictures on his iPad until he finds the one that matches the same word and then stops pressing buttons.

Variation 17: Nonidentical Vocal Word SD—Vocal Word Response

SD: Different therapists say the same word (Woody) but vary the pitch, volume, or inflection.

Response: The child touches his own iPad with the picture that makes the original sound.

Variation 18: Vocal Word SD—No Vocal Word Response

SD: The therapist says “Woody” with three objects on the table.

Response: The child touches the correct object. This is a typical receptive labels program.

Appendix 2

Receptive Language Program Format Variations Posted to the Me-List (1997)

The following is a list of possible formats for receptive programs. Some are messy and require laminating the cards.

  1. Give the child a flyswatter. SD: “Slap __.”
  2. Place cards or objects in clear plastic shoeholders that can be hung vertically. Give the child a ruler. SD: “Point to __.”
  3. Place cards or objects on the floor. Give the child a beanbag. SD: “Throw onto __.”
  4. Place cards in the gripping end of wooden clothespins and stand the clothespins up like bowling pins. You could also try using plastic tripod paper clips. Give the child a party blower that unrolls when inflated, a disc shooter, or a water gun, or have the child use his hand. SD: “Knock down __.”
  5. Give the child a flashlight or a laser pen. SD:  “Shine on __.”
  6. Give the child a feather duster. SD: “Dust __.”
  7. Give the child bingo chips or small figurines. SD: “Put on __.”
  8. Make extra large flash cards (8 × 10) and place them on the floor. SD: “Step [or jump] on __.”
  9. Use a chalkboard inside or sidewalk chalk outside with targets and distractors written or drawn on the surface. SD: “Spray __.”
  10. Use regular flash cards or items but in sets, moving closer to reinforcers (like “Mother May I?”; i.e., 6 ft. from the trampoline: first set, 4 ft. away: second set, 2 ft. away: third set, then go on the trampoline), going up stairs, out the door, and so on.
  11. Hang cards on a clothesline or on a hanger with a clothespin. SD: “Pull [or take] off __.”
  12. Use tactile materials such as shaving cream, sand, finger paint, or Play-Doh for graphic motor. SD: “Write __ (with your finger).”
  13. Tape small pictures or stickers onto a large rolling pin or plastic soda bottle. Have the child slowly roll the pin to find the target and then point it out. SD: “Find __.”
  14. Hang a magnetic board on the wall. Attach pictures with magnetic clips. SD: “Pull [or take] off __.”
  15. Give the child a favorite stuffed animal, figure, or puppet to “feed.” SD: “Feed Elmo __.”
  16. Give the child a “magic wand.” You can buy ones that light up at some toy stores or make your own. SD: “Touch [or tap] __.”
  17. Hide cards or objects in a container filled with rice or sand for the child to dig up. SD: “Give me __.”
  18. Reverse the previous activity. Place cards in front of the container. SD: “Bury __.”
  19. Attach paper clips to the flashcards you are using and then spread the cards out on the floor. Give the child a stick with a string tied on the end and a magnet tied to the end of the string to “catch” the target. SD: “Catch __.”
  20. Tape pictures, flashcards, and so on to balloons. Give the child something that he can safely pop the balloons with. SD: “Pop ___.” “Poke ___.”
  21. Tape each picture, flashcard, and so on to a flower shape you have cut out of colorful construction paper. Tape the shape to a tongue depressor. Go out to the garden (or find a pot if you can’t go outside) and plant them. SD: “Plant ___.”
  22. Reverse the previous process and start with all of the flowers planted. Have the child pick the target flower. SD: “Pick ___.”
  23. Give the child a stamper to stamp the target with. Use a baby wipe to clean off cards between trials, or just use a stamper without any ink on it. SD: “Stamp __.”
  24. Give the child a paintbrush and a cup of water. SD: “Paint on __.”
  25. Stick small lumps of clay on the tops of toy cars. Stick the cards vertically in the lumps of clay so that the cards stand up on the car roofs. SD: “Push __.”

References

Arntzen, E. (2006). Delayed matching to sample: probability of responding in accord with equivalence as a function of different delays. The Psychological Record, 56, 135–167.

Boesch, M. C., Wendt, O., Subramanian, A., & Hsu, N. (2013). Comparative efficacy of the picture exchange communication system (PECS) versus a speech-generating device: effects on requesting skills. Research in Autism Spectrum Disorders, 7, 480–493. https://doi.org/10.1016/j.rasd.2012.12.002

Booth, T. (1978). Early receptive language training for the severely and profoundly retarded. Language, Speech, and Hearing Services in Schools, 9, 151–154. https://doi.org/10.1044/0161-1461.0903.151.

Chesnut, M., Williamson, P. N., & Morrow, J. E. (2003). The use of visual cues to teach receptive skills to children with severe auditory discrimination deficits. The Behavior Analyst Today, 4, 212–224. https://doi.org/10.1037/h0100120.

Chong, I. M., & Carr, J. E. (2010). Failure to demonstrate the differential outcomes effect in children with autism. Behavioral Interventions, 25, 339–348. https://doi.org/10.1002/bin.318.

Cohen, H., Amerine-Dickens, M., & Smith, T. (2006). Early intensive behavioral treatment: replication of the UCLA model in a community setting. Journal of Developmental and Behavioral Pediatrics, 27(Suppl. 2), S145–S155. https://doi.org/10.1097/00004703200604002-00013

Curiel, E. S., Sainato, D. M., & Goldstein, H. (2016). Matrix training of receptive language skills with a toddler with autism spectrum disorder: a case study. Education and Treatment of Children, 39, 95–109.

Dube, W. V., McIlvane, W. J., Mackay, H. A., & Stoddard, L. T. (1987). Stimulus class membership established via stimulus–reinforcer relations. Journal of the Experimental Analysis of Behavior, 47, 159–175.

Dyer, K., Christian, W. P., & Luce, S. C. (1982). The role of response delay in improving the discrimination performance of autistic children. Journal of Applied Behavior Analysis, 15, 231–240.

Eikeseth, S., & Hayward, D. W. (2009). The discrimination of object names and object sounds in children with autism: a procedure for teaching verbal comprehension. Journal of Applied Behavior Analysis, 42, 807–812. https://doi.org/10.1901/jaba.2009.42-807.

Geiger, K. B., Carr, J. E., LeBlanc, L. A., Hanney, N. M., Polick, A. S., & Heinicke, M. R. (2012). Teaching receptive discriminations to children with autism: a comparison of traditional and embedded discrete trial teaching. Behavior Analysis in Practice, 5, 49–59.

Goeters, S., Blakely, E., & Poling, A. (1992). The differential outcomes effect. The Psychological Record, 42, 389–411.

Green, G. (2001). Behavior analytic instruction for learners with autism: advances in stimulus control technology. Focus on Autism and Other Developmental Disabilities, 16, 72–85. https://doi.org/10.1177/108835760101600203

Grow, L. L., & LeBlanc, L. (2013). Teaching receptive language skills: recommendations for instructors. Behavior Analysis in Practice, 6, 56–75.

Grow, L. L., Carr, J. E., Kodak, T. M., Jostad, C. M., & Kisamore, A. N. (2011). A comparison of methods for teaching receptive labeling to children with autism spectrum disorders. Journal of Applied Behavior Analysis, 44, 475–498. https://doi.org/10.1901/jaba.2011.44-475.

Grow, L. L., Kodak, T., & Carr, J. E. (2014). A comparison of methods for teaching receptive labeling to children with autism spectrum disorders: a systematic replication. Journal of Applied Behavior Analysis, 47, 600–605. https://doi.org/10.1002/jaba.141.

Holmes, E. J., Eikeseth, S., & Schulze, K. A. (2015). Teaching individuals with autism receptive labeling skills involving conditional discriminations: a comparison of mass trial and intermixing before random rotation, random rotation only, and combined blocking. Research in Autism Spectrum Disorders, 11, 1–12. https://doi.org/10.1016/j.rasd.2014.11.013.

Kassuba, T., Menz, M. M., Röder, B., & Siebner, H. R. (2013). Multisensory interactions between auditory and haptic object recognition. Cerebral Cortex, 23, 1097–1107. https://doi.org/10.1093/cercor/bhs076

Kerr, N., Meyerson, L., Flora, J., Tharinger, D., Schallert, D., Casey, L., & Fehr, M. J. (1977). The measurement of motor, visual, and auditory discrimination skills in mentally retarded children and adults and in young normal children. Rehabilitation Psychology, 24, 91–206. https://doi.org/10.1037/h0090912.

Kodak, T., Clements, A., Paden, A. R., LeBlanc, B., Mintz, J., & Toussaint, K. A. (2015). Examination of the relation between an assessment of skills and performance on auditory–visual conditional discriminations for children with autism spectrum disorder. Journal of Applied Behavior Analysis, 48, 52–70. https://doi.org/10.1002/jaba.160

Lamela, L., & Tincani, M. (2012). Brief wait time to increase response opportunity and correct responding of children with autism spectrum disorder who display challenging behavior. Journal of Developmental and Physical Disabilities, 24, 559–573. https://doi.org/10.1007/s10882-012-9289-x

Leaf, J. B., Sheldon, J. B., & Sherman, J. A. (2010). Comparison of simultaneous prompting and no-no prompting in two-choice discrimination learning with children with autism. Journal of Applied Behavior Analysis, 43, 215–228. https://doi.org/10.1901/jaba.2010.43-215

Lind, J., Enquist, M., & Ghirlanda, S. (2015). Animal memory: a review of delayed matching-to-sample data. Behavioural Processes, 117, 52–58. https://doi.org/10.1016/j.beproc.2014.11.019

Litt, M. D., & Schreibman, L. (1981). Stimulus-specific reinforcement in the acquisition of receptive labels by autistic children. Analysis and Intervention in Developmental Disabilities, 1, 171–186. https://doi.org/10.1016/0270-4684(81)90030-6

Lovaas, O. I. (1987). Behavioral treatment and normal educational and intellectual functioning in young autistic children. Journal of Consulting and Clinical Psychology, 55, 3–9. https://doi.org/10.1037/0022-006X.55.1.3

Lovaas, O. I. (2003). Teaching individuals with developmental delays: basic intervention techniques. Austin, TX: PRO-ED.

Lund, S. K. (2004). Selection-based imitation: a tool skill in the development of receptive language in children with autism. The Behavior Analyst Today, 5, 27–38. https://doi.org/10.1037/h0100132

Majdalany, L. M., Wilder, D. A., Greif, A., Mathisen, D., & Saini, V. (2014). Comparing massed-trial instruction, distributed-trial instruction, and task interspersal to teach tacts to children with autism spectrum disorders. Journal of Applied Behavior Analysis, 47, 657–662. https://doi.org/10.1002/jaba.149

Marion, C., Vause, T., Harapiak, S., Martin, G. L., Yu, C. T., Sakko, G., & Walters, K. L. (2003). The hierarchical relationship between several visual and auditory discriminations and three verbal operants among individuals with developmental disabilities. Analysis of Verbal Behavior, 19, 91–105.

McEachin, J. J., Smith, T., & Lovaas, O. I. (1993). Long-term outcome for children with autism who received early intensive behavioral treatment. American Journal on Mental Retardation, 97, 359–372.

McGee, G. G., Krantz, P. J., Mason, D., & McClannahan, L. E. (1983). A modified incidental-teaching procedure for autistic youth: acquisition and generalization of receptive object labels. Journal of Applied Behavior Analysis, 16, 329–338.

McLay, L. K., Sutherland, D., Church, J., & Tyler-Merrick, G. (2013). The formation of equivalence classes in individuals with autism spectrum disorder: a review of the literature. Research in Autism Spectrum Disorders, 7, 418–431. https://doi.org/10.1016/j.rasd.2012.11.002

Pelios, L. V., & Sucharzewski, A. (2004). Teaching receptive language to children with autism: a selective overview. The Behavior Analyst Today, 4, 378–385. https://doi.org/10.1037/h0100123

Pérez-González, L. A., & Williams, G. (2002). Multicomponent procedure to teach conditional discriminations to children with autism. American Journal on Mental Retardation, 107, 293–301.

Pérez-González, L. A., Cereijo-Blanco, N., & Carnerero, J. J. (2014). Emerging tacts and selections from previously learned skills: a comparison between two types of naming. Analysis of Verbal Behavior, 30, 184–192. https://doi.org/10.1007/s40616-014-0011-1

Petursdottir, A. I., & Aguilar, G. (2016). Order of stimulus presentation influences children’s acquisition in receptive identification tasks. Journal of Applied Behavior Analysis, 49, 58–68. https://doi.org/10.1002/jaba.264

Reid, D. H., DiCarlo, C. F., Schepis, M. M., Hawkins, J., & Stricklin, S. (2003). Observational assessment of toy preferences among young children with disabilities in inclusive settings: efficiency analysis and comparison with staff opinion. Behavior Modification, 27, 233–250. https://doi.org/10.1177/0145445503251588

Sakko, G., Martin, T. L., Vause, T., Martin, G. L., & Yu, C. T. (2004). Visual–visual nonidentity matching assessment: a worthwhile addition to the assessment of basic learning abilities test. American Journal on Mental Retardation, 109, 44–52. https://doi.org/10.1352/0895-8017(2004)109<44:VNMAAW>2.0.CO;2

Sallows, G. O., & Graupner, T. D. (2005). Intensive behavioral treatment for children with autism: four-year outcome and predictors. American Journal on Mental Retardation, 110, 417–438.

Saunders, K. J., & Spradlin, J. E. (1989). Conditional discrimination in mentally retarded adults: the effect of training the component simple discriminations. Journal of the Experimental Analysis of Behavior, 52, 1–12.

Sidman, M. (1986). Functional analysis of emergent verbal classes. In M. D. Zeiler & T. Thompson (Eds.), Analysis and integration of behavioral units (pp. 213–245). New York, NY: Erlbaum.

Sidman, M. (2008). Reflections on stimulus control. The Behavior Analyst, 31, 127–135.

Sidman, M. (2010). Reply to commentaries on “remarks” columns. Behavior and Philosophy, 38, 179–197.

Simpson, K., & Keen, D. (2010). Teaching young children with autism graphic symbols embedded within an interactive song. Journal of Developmental and Physical Disabilities, 22, 165–177. https://doi.org/10.1007/s10882-009-9173-5

Simpson, K., Keen, D., & Lamb, J. (2013). The use of music to engage children with autism in a receptive labelling task. Research in Autism Spectrum Disorders, 7, 1489–1496. https://doi.org/10.1016/j.rasd.2013.08.013

Simpson, K., Keen, D., & Lamb, J. (2015). Teaching receptive labelling to children with autism spectrum disorder: a comparative study using infant-directed song and infant-directed speech. Journal of Intellectual and Developmental Disability, 40, 126–136. https://doi.org/10.3109/13668250.2015.1014026

Smith, T. (1994). Improving memory to promote maintenance of treatment gains in children with autism. The Psychological Record, 44, 459–473.

Tincani, M., & Crozier, S. (2008). Comparing brief and extended wait-time during small group instruction for children with challenging behavior. Journal of Behavioral Education, 17, 63–78. https://doi.org/10.1007/s10864-008-9063-4

Urcuioli, P. J. (2005). Behavioral and associative effects of differential outcomes in discrimination learning. Learning & Behavior, 33, 1–21. https://doi.org/10.3758/BF03196047

Valcante, G., Roberson, W., Reid, W. R., & Wolking, W. D. (1989). Effects of wait-time and intertrial interval durations on learning by children with multiple handicaps. Journal of Applied Behavior Analysis, 22, 43–55. https://doi.org/10.1901/jaba.1989.22-43

Varella, A. B., & de Souza, D. G. (2014). Emergence of auditory–visual relations from a visual–visual baseline with auditory-specific consequences in individuals with autism. Journal of the Experimental Analysis of Behavior, 102, 139–149. https://doi.org/10.1002/jeab.93

Vedora, J., & Grandelski, K. (2015). A comparison of methods for teaching receptive language to toddlers with autism. Journal of Applied Behavior Analysis, 48, 188–193. https://doi.org/10.1002/jaba.167

Volkert, V. M., Lerman, D. C., Trosclair, N., Addison, L., & Kodak, T. (2008). An exploratory analysis of task-interspersal procedures while teaching object labels to children with autism. Journal of Applied Behavior Analysis, 41, 335–350. https://doi.org/10.1901/jaba.2008.41-335

Yoder, P., Watson, L. R., & Lambert, W. (2015). Value-added predictors of expressive and receptive language growth in initially nonverbal preschoolers with autism spectrum disorders. Journal of Autism and Developmental Disorders, 45, 1254–1270. https://doi.org/10.1007/s10803-014-2286-4

Zaine, I., Domeniconi, C., & de Rose, J. C. (2014). Simple and conditional discrimination and specific reinforcement in teaching reading: an intervention package. Analysis of Verbal Behavior, 30, 193–204. https://doi.org/10.1007/s40616-014-0010-2

Replication of a skills assessment for auditory–visual conditional discrimination training

Jan 5, 2026

 

Tiffany Kodak
Marquette University

Samantha Bergmann
University of North Texas

Maria Clara Cordeiro
Marquette University

Abstract

Auditory–visual conditional discrimination training (e.g., receptive identification training, listener responses; AVCD) is ubiquitous in early intervention and special education programs. Nevertheless, some learners with Autism Spectrum Disorder (ASD) do not appear to benefit from this training despite the use of empirically validated treatments. To prevent exposure to extended training that does not lead to learning, a skills assessment that measures skills related to AVCD training would be useful for educators and practitioners. The current study replicated the skills assessment developed and evaluated by Kodak et al. (2015) with 8 participants with ASD who received behavior analytic intervention that included at least 1 goal related to AVCD training. Two of the 8 participants mastered all skills included in the assessment except scanning. In addition, 5 participants’ responding failed to reach mastery during subsequent exposure to AVCD training, further demonstrating the predictive utility of the skills assessment.

Key words: auditory–visual conditional discrimination training, autism spectrum disorder, listener responses, receptive identification, skills assessment

We thank Erica Dashow, Stacy Lauderdale-Litten, Ethan Eisdorfer, David Singer, Audrey Torricelli, Melanie Erwinski, Courtney Meyerhofer, Abigail Stoppleworth, and Daniela Silva for their assistance with data collection and analysis.
Address correspondence to: Dr. Tiffany Kodak, 525 N. 6th St., Milwaukee, WI 53203.
Email: tiffany.kodak@marquette.edu
doi: 10.1002/jaba.909
© 2022 Society for the Experimental Analysis of Behavior

 

Throughout the day, people encounter thousands of stimuli that require some type of differential responding. For example, a visual discrimination occurs when a parent who picks up her child from daycare approaches and hugs her own child rather than approaching and hugging an unknown child. Auditory discriminations also frequently occur, such as when a parent answers her phone when she hears the programmed daycare ring tone but not when she hears the standard ring tone. Due to the prevalence of these stimuli and the necessity of accurate differential responding to interact with our environment, behavior analysts frequently teach discriminations to individuals who do not acquire them through daily interactions and natural learning opportunities.
Children with Autism Spectrum Disorder (ASD) who receive comprehensive behavioral intervention are frequently exposed to discrimination training (Green, 2001; Grow & LeBlanc, 2013; LaMarca & LaMarca, 2018). One type of discrimination that is targeted is a simple discrimination, which includes a three-term contingency (i.e., antecedent, behavior, and consequence). For example, an adult may say “clap,” the child claps his hands, and the adult provides enthusiastic praise. Another, more advanced type of discrimination taught to children with ASD is a conditional discrimination, which includes a four-term contingency in which a conditional stimulus and an antecedent stimulus collectively occasion a response that produces reinforcement. For example, a parent and child might look through a picture book of common animals, and the parent turns the page to reveal animals that live at a zoo. While the child is looking at the pictures (antecedent stimuli), the parent asks, “Where is the lion?” (the conditional stimulus), which alters the function of one of the pictures on the page. In this example, the picture of the lion becomes a discriminative stimulus and the other pictures become S-deltas. When the child touches the picture of the lion, the parent provides praise (e.g., “You found the lion! You are so smart.”).

Auditory–visual conditional discrimination (AVCD), as described in the previous example, includes an auditory sample stimulus (e.g., “Where is the lion?”) and visual antecedent stimuli (e.g., an array of pictures including a lion, zebra, monkey, hippo, and elephant). Many everyday tasks require an AVCD, and this type of discrimination (also referred to as receptive identification training, receptive labeling, and listener responses) is frequently targeted in comprehensive behavioral intervention programs and included in many early intervention curriculum manuals (e.g., Leaf & McEachin, 1999; Lovaas, 2003; Maurice et al., 2001). In addition, AVCD training commonly occurs in special education settings and is included as an Individualized Education Program goal for most students with ASD (Kodak et al., 2018).

Despite the prevalence of AVCD training in early intervention and special education programs for children with ASD, a proportion of these children may not readily acquire these discriminations during training. Special education teachers reported difficulty in establishing these discriminations in a proportion of students with ASD (Kodak et al., 2018). In addition, Kodak et al. (2015) showed that 44% of participants with ASD did not acquire AVCD during exposure to typically efficacious intervention procedures. The authors developed a skills assessment to measure behavior that is related to AVCD. For example, during AVCD training, a learner should attend to and differentially respond to auditory stimuli (e.g., “cat,” “dog,” or “bird” across trials), scan and differentially respond to an array of visual stimuli (e.g., pictures of a cat, dog, and bird), match the auditory sample stimulus to the corresponding visual stimulus (i.e., match “cat” with a picture of a cat despite no physical similarity of stimuli), and respond to prompts delivered by the instructor (e.g., a model prompt) prior to and following errors. Therefore, the component discriminations required of the learner include an auditory discrimination (between samples), visual discrimination (between pictures in the array), and nonidentity matching (matching auditory stimulus to visual stimulus; Kerr et al., 1977).

Kodak et al. (2015) evaluated whether the outcomes of unsuccessful AVCD training were correlated with outcomes on a skills assessment that measured potential prerequisite skills for AVCD. Their skills assessment measured responding in five conditions: identity matching, imitation, visual discrimination, auditory discrimination, and scanning. Four of the nine participants did not engage in mastery level responding during one or more of the conditions included in the skills assessment. Following the skills assessment, all nine participants received AVCD training. Kodak et al. found that the skills assessment was predictive of successful AVCD training for all five participants who showed mastery of all conditions in the skills assessment. Further, the skills assessment accurately predicted unsuccessful AVCD training for two of the four participants who failed to master one or more conditions in the assessment. Thus, the skills assessment accurately differentiated between seven of the nine participants who may or may not acquire AVCD when exposed to efficacious AVCD training procedures without substantial procedural modifications.

Although the results of Kodak et al. (2015) provide initial evidence for the utility of an assessment to measure skills related to AVCD training, the assessment was not as predictive of unsuccessful AVCD training for two of the four participants who failed to master one or more conditions of the skills assessment. One participant (Amar) reached mastery during AVCD training without substantial procedural modifications despite failing to reach mastery of the auditory discrimination in the skills assessment. The other participant (Hal) reached mastery during AVCD training following substantial procedural modifications (lengthy exposure to blocking) despite failing to master three of the five conditions in the skills assessment. Therefore, additional replications with learners who are unlikely to master one or more conditions in the skills assessment are needed.

In clinical practice, the application of a skills assessment could prevent exposing learners to AVCD training that is unlikely to produce positive treatment outcomes. In addition, a skills assessment could identify component skills to teach before AVCD instruction, similar to other assessments used in comprehensive behavioral intervention (e.g., Verbal Behavior Milestones Assessment and Placement Program, VB-MAPP; Sundberg, 2008). For these reasons, the skills assessment is most useful for identifying learners who are receiving behavior analytic services and are unlikely to benefit from AVCD training, and replication of this skills assessment with additional learners with ASD is necessary.

The purpose of the current study was to improve upon some conditions of the skills assessment developed by Kodak et al. (2015) and conduct the skills assessment with additional participants with ASD. A modification was made to one condition in the skills assessment to reduce the likelihood that response biases influenced the assessment results. Kodak et al. reported that several participants showed an error pattern of responding to the stimulus card in the go/no-go procedure arranged in the auditory-discrimination condition. That is, instead of touching a stimulus card in the presence of one sound (e.g., a “go” response) and refraining from touching the stimulus card in the presence of a second sound (e.g., a “no-go” response), several participants in Kodak et al. consistently touched the stimulus card in the presence of both sounds. This error pattern may be common for individuals with a history of reinforcement for touching stimuli on tabletops (Bergmann et al., 2021; Serna, 2016; Serna et al., 2009). The current assessment replaced the go/no-go procedure with an auditory-discrimination condition (auditory identity matching) shown to establish auditory discrimination in children with ASD by Bergmann et al. (2021) and included three stimuli in the array to reduce the likelihood of stimulus and position biases (Green, 2001; Grow & LeBlanc, 2013).

Method

Participants, Setting, and Materials

Participants included eight children or adolescents who were diagnosed with ASD by a professional not associated with this study. Refer to Table 1 for participants’ ages. All participants attended a center-based intervention program and had educational or treatment goals related to acquisition of AVCD. In addition, six participants (Ryan, Ben, Roger, Hank, Doug, and Gina) had previous exposure to unsuccessful AVCD training with empirically supported practices (e.g., blocking, instruction with error correction and differential reinforcement, a simple-conditional method) in their school and center-based intervention program. Those same six participants had little to no functional vocal-verbal behavior and used a picture-based communication system to request items from others. Two participants (Lance and Arthur) communicated in full sentences and mastered most or all verbal behavior milestones measured in the VB-MAPP (Sundberg, 2008).
None of the participants had known impairments in hearing or vision (e.g., they selected preferred items from an array, they oriented toward sounds). Gina did not participate in AVCD training due to an extended absence after completion of the skills assessment.

We conducted sessions in a private room that contained a table, chairs, relevant session materials (e.g., data sheets, picture cards, BIGmack® buttons), preferred items, the participant’s augmentative communication device (e.g., binder with icons), and a video camera for data collection.

The stimuli selected for inclusion in the skills assessment and AVCD training were consistent across five participants, and AVCD training targets were individualized for the remaining three participants; refer to Tables 2 and 3 in the online supporting information for a list of stimuli assigned to each condition. Stimuli included in each condition were visually discrepant (e.g., different colors and shapes) and had minimal overlap in sounds (e.g., stimuli did not start or end with the same sound; Gast, 2010). Finally, none of the participants had prior exposure to instruction with the stimuli included in AVCD training during their behavior analytic services.

Table 1
Summary of Participants’ Ages and Outcomes of Conditions in the Skills Assessment and Auditory–Visual Conditional Discrimination Training

Participant   Age         IDM   Imit   VD   AD   Scan   AVCD Training
Ryan          9 y, 2 m    +     +      –    –    –      –
Ben           13 y, 0 m   +     +      +    –    +      –
Roger         5 y, 8 m    +     +      +    –    –      –
Hank          11 y, 4 m   +     +      +    –    –      –
Doug          7 y, 8 m    +     +      +    –    –      –
Lance         9 y, 8 m    +     +      +    +    –      +
Arthur        12 y, 9 m   +     +      +    +    –      +
Gina          8 y, 8 m    +     +      +    –    –      N/A

Note. Responding that met (+) or did not meet (–) the mastery criterion during skills-assessment conditions and AVCD training. IDM = Identity-matching condition; Imit = Imitation condition; VD = Visual-discrimination condition; AD = Auditory-discrimination condition; Scan = Scanning condition; AVCD = Auditory–visual conditional discrimination.

Response Measurement, Interobserver Agreement, and Procedural Integrity

The primary dependent variables in the skills assessment and AVCD training were correct responses and scanning (skills assessment only). A correct response was defined as touching the target stimulus in the array, depressing the target BIGmack® button in the array, or placing a picture card on top (Ryan and Doug) or in front (Ben) of the identical picture card in the array within 5 s of the initiation of the trial. Scanning was defined as an uninterrupted shift in the participant’s eye gaze from one stimulus to the next. The experimenter converted each measure to a percentage by dividing the number of trials with an occurrence of the behavior by the total number of trials in a session, multiplied by 100.

Observers also collected data on secondary dependent variables, including prompted responses, errors, and no responses during trials, although these data are not displayed in the figures. A prompted response was defined as touching the target stimulus in the array, depressing the target BIGmack® button such that it played the auditory stimulus, or placing a picture card on top or in front of an identical stimulus in the array within 5 s of a model prompt or when physically guided. An error was defined as touching any stimulus other than the target stimulus in the array, depressing a BIGmack® button other than the target stimulus in the array, or placing a picture card on top or in front of a nonidentical stimulus in the array within 5 s of the initiation of the trial. A no response was defined as the participant failing to engage in a response to the stimuli in the array within 5 s of the initiation of the trial.

Two observers independently recorded data on each dependent variable during 60% to 100% of sessions of the skills assessment and 35% to 100% of sessions of AVCD training. The experimenter calculated agreement using a trial-by-trial method. Agreement for each trial occurred when both observers recorded the exact same dependent variable(s). Interobserver agreement was calculated by dividing the number of trials with an agreement by the total number of trials in a session and multiplying by 100. Mean agreement for all dependent variables in the skills assessment was 100% for Ryan, 93.4% (range, 67% to 100%) for Ben, 90.9% (range, 58% to 100%) for Roger, 91.8% (range, 75% to 100%) for Hank, 97.6% (range, 95% to 100%) for Doug, 98.5% (range, 91.7% to 100%) for Lance, 100% for Arthur, and 99.7% (range, 97.9% to 100%) for Gina. Mean agreement for AVCD training was 99.2% (range, 92% to 100%) for Ryan, 100% for Ben, 98.4% (range, 83% to 100%) for Roger, 98.3% (range, 92% to 100%) for Hank, 99.2% (range, 92% to 100%) for Doug, 100% for Lance, and 100% for Arthur. Lower percentages in some sessions of the skills assessment were often a result of differences in scanning data.
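The trial-by-trial agreement calculation described above can be sketched in a few lines of Python. The function name and observer records below are illustrative, not drawn from the study’s materials.

```python
# Trial-by-trial interobserver agreement (IOA): the percentage of trials
# on which both observers recorded the exact same dependent variable(s).
# Observer records here are hypothetical.

def trial_by_trial_ioa(observer1, observer2):
    if len(observer1) != len(observer2):
        raise ValueError("Observers must score the same number of trials")
    agreements = sum(1 for a, b in zip(observer1, observer2) if a == b)
    return 100 * agreements / len(observer1)

# A hypothetical 12-trial session in which the observers disagree on 3 trials.
obs1 = ["correct", "error", "correct", "no response"] * 3
obs2 = ["correct", "error", "prompted", "no response"] * 3
print(trial_by_trial_ioa(obs1, obs2))  # 9 of 12 trials agree -> 75.0
```

The same divide-and-multiply-by-100 arithmetic underlies the percentage conversion for correct responses and scanning reported earlier in this section.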

An observer also collected data on the experimenter’s procedural integrity during 41% to 100% of skills-assessment sessions and 34% to 100% of AVCD training sessions. Procedural integrity was defined as the experimenter implementing all aspects of the protocol exactly as written during each trial (e.g., presenting the stimulus/stimuli as indicated for the trial, securing attending, waiting the allotted response interval, delivering the correct consequences). The observers scored integrity during each trial as either a 1 (all components of the trial were implemented correctly) or a 0 (one or more of the trial components were not implemented correctly). The experimenter calculated the percentage of procedural integrity by dividing the number of trials with a score of 1 by the total number of trials in a session and multiplying by 100. Mean procedural integrity for the skills assessment was 98.7% (range, 96% to 100%) for Ryan, 100% for Ben, 96.7% (range, 68% to 100%) for Roger, 99.5% (range, 98% to 100%) for Hank, 99.4% (range, 97% to 100%) for Doug, 100% for Lance, 100% for Arthur, and 96.9% (range, 83.3% to 100%) for Gina. Roger’s lower level of integrity occurred during the identity-matching condition, in which the experimenter did not consistently say “match” prior to a response opportunity. Mean procedural integrity for AVCD training was 97.9% (range, 83% to 100%) for Ryan, 99.4% (range, 92% to 100%) for Ben, 100% for Roger, 88.3% (range, 50% to 100%) for Hank, 99.4% (range, 92% to 100%) for Doug, 100% for Lance, and 100% for Arthur. Hank’s lower percentage of integrity occurred during baseline when he did not scan the array independently, and the experimenter secured attending but did not say “look” before pointing to each stimulus and providing brief praise for prompted looking.

Preference Assessment

A brief multiple stimulus without replacement (MSWO) preference assessment (Carr et al., 2000) was conducted prior to each session. The first item selected by the participant was provided contingent on correct responses during trials. Participant mands for an alternative item during sessions (using their augmentative communication system or vocal mands) were honored.
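As a rough sketch of the bookkeeping for a brief MSWO: items leave the array as they are chosen, so the order of selection is itself the preference ranking, and the first item chosen serves as the session reinforcer. The item names below are hypothetical examples, not the study’s stimuli.

```python
# Brief MSWO record keeping: map each item to its preference rank
# (1 = chosen first). Item names are hypothetical.

def mswo_ranking(selection_order):
    return {item: rank for rank, item in enumerate(selection_order, start=1)}

selections = ["tablet", "bubbles", "crackers", "puzzle"]
ranking = mswo_ranking(selections)
session_reinforcer = selections[0]   # first pick is delivered for correct responses
print(session_reinforcer, ranking["bubbles"])  # tablet 2
```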

Skills Assessment

All procedures in the skills assessment were based on those used by Kodak et al. (2015). In all conditions, the experimenter presented an array of three stimuli (rather than two; Kodak et al., 2015) and stimuli were rotated across trials. This modification to the array size was made to decrease the likelihood of establishing a position bias (Green, 2001) and to align the array size with current AVCD practice recommendations (e.g., Grow & LeBlanc, 2013; LaMarca & LaMarca, 2018). Representative pictures of the stimulus arrangement in each condition are available in Supporting Information.

All conditions included 12 trials per session. Two to six sessions were conducted per day, 3 to 5 days per week. A multielement design was used to evaluate the effects of each condition in the skills assessment on correct responses. The conditions included identity matching, imitation, visual discrimination, auditory discrimination, and scanning. Skills assessment conditions alternated in a semirandom order; we conducted the same condition no more than two sessions in a row. The mastery criterion for each condition in the skills assessment was two consecutive sessions with at least 80% correct responses. Once the participant’s responding reached mastery in one condition in the skills assessment, sessions of the remaining conditions continued in semirandom alternation until either every condition was mastered or a condition reached the 10-session discontinuation criterion (Kodak et al., 2015).
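The semirandom alternation rule above (no condition scheduled more than two sessions in a row, with conditions dropping out as they are mastered or discontinued) could be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the study’s actual scheduling procedure.

```python
import random

# Semirandom alternation of skills-assessment conditions: choose at
# random from the conditions still in play, but never schedule the same
# condition three sessions in a row. Illustrative reconstruction only.

CONDITIONS = ["identity matching", "imitation", "visual discrimination",
              "auditory discrimination", "scanning"]

def next_condition(history, active):
    choices = list(active)
    if (len(history) >= 2 and history[-1] == history[-2]
            and history[-1] in choices and len(choices) > 1):
        choices.remove(history[-1])   # block a third consecutive repeat
    return random.choice(choices)

history = []
for _ in range(30):
    history.append(next_condition(history, CONDITIONS))

# Property check: no condition ever runs more than two sessions in a row.
assert not any(history[i] == history[i + 1] == history[i + 2]
               for i in range(len(history) - 2))
```

In practice the `active` list would shrink as each condition met the mastery criterion or the 10-session discontinuation criterion.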

During trials, the experimenter placed three picture cards or four BIGmack® buttons on the table in front of the participant. The location of the target stimulus in the array was randomly rotated during each trial so that target stimuli were placed in each position in the array an equal number of times per session.
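The balanced rotation of the target’s location could be generated as below: across a 12-trial session the target occupies each of the three array positions an equal number of times (four each), in a shuffled order. This is a sketch of the counterbalancing logic, not the study’s materials.

```python
import random

# Build a shuffled list of target positions for one session such that
# each position is used an equal number of times. Illustrative sketch.

def target_positions(n_trials=12, n_positions=3):
    reps, remainder = divmod(n_trials, n_positions)
    if remainder:
        raise ValueError("Trials must divide evenly across positions")
    order = list(range(n_positions)) * reps   # e.g., [0, 1, 2] * 4
    random.shuffle(order)
    return order

positions = target_positions()
print(sorted(set(positions)), positions.count(0))  # [0, 1, 2] 4
```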

Identity Matching

The experimenter placed three pictures in a horizontal array in front of the participant, either handed the participant a picture card (Ryan, Ben, and Doug) or held up a picture card (Roger, Hank, Lance, Arthur, and Gina), and said “match.” If the participant placed the picture card on top (Ryan and Doug) or in front (Ben) of the corresponding picture in the array, or if the participant touched the corresponding picture in the array (Roger, Hank, Lance, Arthur, and Gina), the experimenter provided enthusiastic praise and an edible or 20-s access to a tangible item. Following an error or no response within 5 s, the experimenter cleared the array and presented the next trial. Scanning was measured during each trial.

Imitation

The experimenter placed three pictures in a horizontal array on the table in front of the participant and said “do this” while pointing at one of the pictures. If the participant imitated the experimenter’s behavior by pointing to the same picture in the array, the experimenter provided enthusiastic praise and an edible or 20-s access to a tangible item. If the participant pointed to a different picture or did not respond within 5 s, the experimenter cleared the array and moved to the next trial. The target picture and location in the array changed across trials.

Visual Discrimination

The experimenter placed three pictures on the table in front of the participant. One picture served as the target stimulus in all trials in each session (e.g., a horse), and the array remained constant (i.e., the same three pictures were presented in every trial although the position rotated across trials). The first session included a 0-s prompt delay in which the experimenter immediately provided a model prompt or physically guided (Ryan only) the participant to touch the target stimulus. Correct prompted responses produced enthusiastic praise and an edible or 20-s access to a tangible item. The purpose of the 0-s delay session was to provide exposure to the contingency for correct responses to the target stimulus. The remaining sessions of this condition did not include prompts. The participant had up to 5 s to respond to the target stimulus in the array. Correct responses produced praise and an edible or 20-s access to a tangible item, and an error or no response within 5 s resulted in removal of the array and the end of the trial.

Auditory Discrimination

Due to a bias some participants showed toward touching a blank white card placed on the table in every trial regardless of the auditory stimulus presented, the procedures for this condition were modified from those of Kodak et al. (2015). Instead of measuring a go/no-go auditory discrimination (i.e., touching a blank white card in the presence of sound A [go trials] and refraining from touching a blank white card in the presence of sound B [no-go trials]), the auditory discrimination assessed in this condition was auditory–auditory matching (Bergmann et al., 2021).

One auditory stimulus served as a target in each trial, although three auditory stimuli rotated as the target stimulus across trials. The array remained constant (i.e., the same three auditory stimuli were presented in every trial) although the locations of buttons rotated across trials. The experimenter placed one BIGmack® button in front of herself (i.e., the sample stimulus) and three BIGmack® buttons in a horizontal array in front of the participant (i.e., the comparison array). The experimenter pressed her button to play the auditory sample stimulus. Immediately thereafter, the experimenter pressed each button in the array from left to right, allowing the auditory stimulus to finish playing before pressing the next button in the array. The experimenter said “match” and then activated the sample stimulus again. The first session of this condition included a 0-s prompt delay in which the experimenter immediately provided a model prompt (Doug only) or physically guided the participant to press the button in the array that played the auditory stimulus that matched the sample and delivered enthusiastic praise and an edible or 20-s access to a tangible item following correct prompted responses. During all remaining sessions, the experimenter did not provide prompts, and the participant had 5 s to engage in the correct response. Contingent on a correct response, the experimenter provided enthusiastic praise and an edible or 20-s access to a tangible item. An error or no response within 5 s resulted in removal of the array and the end of the trial.

Scanning

Data on scanning were consistent across the identity-matching and imitation conditions in Kodak et al. (2015). Therefore, observers collected data on scanning during sessions of the identity-matching condition only to increase the feasibility of this measure for data collectors. No praise, edibles, or preferred items were provided contingent on scanning.

AVCD Training

Training began after the completion of the skills assessment. The current AVCD training differed from the procedures used by Kodak et al. (2015), who exposed participants to sequential or alternating training procedures (e.g., differential reinforcement without prompts, position prompts, model prompts). We selected a prompt-delay procedure for training because prompt delays in prior studies produced mastery-level responding for some participants with ASD who had difficulty acquiring AVCDs (Grow et al., 2011) or verbal conditional discriminations (Kisamore et al., 2016).

A nonconcurrent multiple baseline across participants design was used to evaluate the effects of intervention on correct responses during AVCD training. Training continued until the participant’s responding met the mastery criterion of two consecutive sessions with at least 80% correct responses or the training discontinuation criterion was reached. To prevent extended exposure to ineffective training procedures, if the participant’s responding did not reach the mastery criterion or show an increasing trend following 15 sessions of training (not including two sessions conducted at 0-s prompt delay), the experimenter discontinued training. If an increasing trend in correct responding was observed after 15 training sessions (Ben only), then the experimenter conducted an additional 15 sessions of training (30 sessions total) before discontinuing training. The experimenters selected a 15-session discontinuation criterion for training based on outcomes of the duration of AVCD training in Kodak et al. (2015) and a similar criterion used by Kisamore et al. (2016).
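The mastery and discontinuation rules above can be sketched in code. Note that in the study an “increasing trend” is a visual-analysis judgment; the split-half comparison below is only an illustrative stand-in for that judgment, and the function names are hypothetical.

```python
# Mastery: two consecutive sessions at or above 80% correct.
# Discontinuation: 15 training sessions without mastery or an increasing
# trend (crudely approximated here by comparing split-half means).

def met_mastery(scores, criterion=80, consecutive=2):
    return any(all(s >= criterion for s in scores[i:i + consecutive])
               for i in range(len(scores) - consecutive + 1))

def increasing_trend(scores):
    half = len(scores) // 2   # stand-in for visual analysis of trend
    return sum(scores[half:]) / (len(scores) - half) > sum(scores[:half]) / half

def training_decision(scores, limit=15):
    if met_mastery(scores):
        return "mastered"
    if len(scores) >= limit and not increasing_trend(scores):
        return "discontinue"
    return "continue"

print(training_decision([75, 83, 92]))   # mastered
print(training_decision([25] * 15))      # discontinue
```

Under the study’s rule for Ben, an increasing trend at session 15 would extend training by up to 15 more sessions before discontinuation.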

Sessions included 12 trials. During all trials, the experimenter placed three picture cards in a horizontal array in front of the participant. The experimenter required the participant to look at each stimulus in the array (either independently or following a prompt to “look” while pointing at each stimulus), and delivered the auditory sample stimulus (e.g., “shell”). That is, participants were required to scan the array prior to the presentation of the auditory sample stimulus and an opportunity to select a comparison stimulus. The participant had up to 5 s to engage in a response after the delivery of the auditory sample stimulus.

Baseline

If the participant engaged in a correct response, the experimenter provided enthusiastic praise only (except for Arthur). Because reinforcers were omitted for correct responses in baseline and praise functioned as a reinforcer for Arthur, he did not receive enthusiastic praise following correct responses. An error or no response within 5 s resulted in the removal of the array and the end of the trial. Mastered tasks were interspersed approximately every three trials to maintain participant responding (Bergmann et al., 2021; Halbur et al., 2021). Correct responses to mastered tasks produced enthusiastic praise and an edible or 20-s access to a tangible item. An error or no response to a mastered task resulted in a prompt and the presentation of a different mastered task, to which a correct response produced reinforcement.

Training

The procedures were similar to baseline, except that mastered tasks were not interspersed during training, and prompts and reinforcement were included in trials. The first two sessions of training included a 0-s delay to a prompt (data not included in the figures). The experimenter presented the auditory sample stimulus and immediately provided a model prompt. If a prompted response occurred within 5 s of the prompt, the experimenter delivered enthusiastic praise and an edible or 20-s access to a tangible item. Thereafter, the experimenter implemented a 5-s prompt delay in which the participant had 5 s to engage in a correct response. Following an error or no response within 5 s, the experimenter repeated the conditional stimulus, provided a model prompt, and delivered praise and an edible or 20-s access to a tangible item following a prompted response. If the participant did not engage in a prompted response following the model prompt, the experimenter provided physical guidance and delivered praise only. Training continued until correct responding met the mastery criterion or reached the discontinuation criterion (i.e., 15 sessions of training with the 5-s prompt delay with no increasing trend in correct responding).
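The consequence logic of a single prompt-delay training trial can be sketched as a small decision table. This is an illustrative model only (the study delivered live instruction, not software); the names are ours, and the assumption that independent correct responses during the 5-s delay produced the same reinforcers as prompted responses is ours, not stated in the text.

```python
def trial_consequence(response):
    """Consequences in one 5-s prompt-delay training trial (illustrative).

    response: "independent" -> correct before the model prompt (assumed to
                               produce the same reinforcers as "prompted")
              "prompted"    -> correct within 5 s of the model prompt
              "guided"      -> required physical guidance after the prompt
    """
    if response in ("independent", "prompted"):
        # Praise plus an edible or 20-s access to a tangible item
        return ["praise", "edible or 20-s tangible"]
    if response == "guided":
        # Physically guided responses produced praise only
        return ["praise"]
    raise ValueError(f"unknown response type: {response}")
```
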

Results

Results of the skills assessment for all eight participants are displayed in Figures 1–3 and Table 1. Ryan met the mastery criterion in the identity-matching and imitation conditions (Figure 1, top panel). Ryan’s percentage of trials with scanning during the identity-matching condition was relatively low; nevertheless, his correct responses during identity-matching sessions suggest that he was attending to the visual stimuli in a sufficient manner. Data collection on scanning was discontinued prior to reaching the mastery criterion because the identity-matching condition was no longer conducted following mastery. Ryan’s responding met the discontinuation criterion in the visual-discrimination and auditory-discrimination conditions.

Ben’s responding met the mastery criterion in the identity-matching, scanning, visual-discrimination, and imitation conditions (Figure 1, middle panel). However, his responding met the discontinuation criterion in the auditory-discrimination condition.

Roger’s responding met the mastery criterion in the imitation, identity-matching, and visual-discrimination conditions (Figure 1, bottom panel). Similar to Ryan, Roger also had low and variable levels of scanning during identity matching, and data collection was discontinued following mastery of identity matching. Roger’s responding met the discontinuation criterion in the auditory-discrimination condition.

Figure 1
Results of the Skills Assessment for Ryan, Ben, and Roger

Note. The dotted line represents the percentage required for mastery.

Hank’s responding met the mastery criterion in the visual-discrimination, identity-matching, and imitation conditions in the minimum number of sessions (Figure 2, top panel). However, similar to Ryan and Roger, Hank displayed low levels of scanning during identity matching, and data collection on scanning was discontinued following mastery of identity matching. Hank’s responding met the discontinuation criterion in the auditory-discrimination condition.

Doug’s responding met the mastery criterion in imitation, identity-matching, and visual-discrimination conditions (Figure 2, bottom panel). However, he also had low levels of scanning behavior during identity matching, and data collection on scanning was discontinued following mastery of identity matching. Doug’s correct responding was variable and met the discontinuation criterion in the auditory-discrimination condition.

Figure 2
Results of the Skills Assessment for Hank and Doug

Note. The dotted line represents the percentage required for mastery.

Lance’s responding met the mastery criterion in the imitation, identity-matching, visual-discrimination, and auditory-discrimination conditions (Figure 3, top panel). However, he also had low levels of scanning behavior during identity matching, and data collection on scanning was discontinued following mastery of identity matching.

Arthur’s responding met the mastery criterion in imitation, identity-matching, visual-discrimination, and auditory-discrimination conditions (Figure 3, middle panel). Similar to Doug and Lance, Arthur had low levels of scanning behavior during identity matching, and data collection on scanning was discontinued following mastery of identity matching.

Gina’s responding met the mastery criterion in imitation, identity-matching, and visual-discrimination conditions (Figure 3, bottom panel). Similar to other participants, she had low levels of scanning behavior during identity matching, and data collection on scanning was discontinued following mastery of identity matching. Gina’s correct responding was low and met the discontinuation criterion in the auditory-discrimination condition.

Figure 3
Results of the Skills Assessment for Lance, Arthur, and Gina

Note. The dotted line represents the percentage required for mastery.

Overall, the skills assessment showed that none of the eight participants demonstrated mastery of all five conditions. However, two participants (Lance and Arthur) met mastery in all conditions except scanning. Four of the participants’ responding failed to meet the mastery criterion in at least two conditions (Ryan, Roger, Hank, and Doug), and six of the participants did not display mastery level responding in the auditory-discrimination condition.

The results of AVCD training for Ryan, Ben, and Roger are shown in Figure 4 and Table 1. Ryan’s correct responding during baseline was low and at chance level (e.g., 33%; Figure 4, top panel). His responding remained low and variable despite the introduction of training, and he met the discontinuation criterion after 15 sessions without an increasing trend. Ben’s correct responses in baseline were stable and at chance level (Figure 4, middle panel). Ben displayed a bias toward the middle position in the stimulus array. Following 15 sessions of training, Ben’s correct responses showed a gradual increasing trend. Therefore, we continued training for an additional 15 sessions (i.e., 30 sessions total). However, his correct responses stabilized and did not meet the mastery criterion. Roger’s correct responses were low and variable in baseline (Figure 4, bottom panel) and following the introduction of training. Roger’s responding met the discontinuation criterion.

Figure 4
Auditory–Visual Conditional Discrimination (AVCD) Training for Ryan, Ben, and Roger

The results of AVCD training for Hank and Doug are shown in Figure 5 and Table 1. Hank’s correct responses were low and at chance level during baseline (Figure 5, top panel) and training. Hank’s responding met the discontinuation criterion. Doug had low and variable levels of correct responses during baseline (Figure 5, bottom panel). Although he showed an increase in correct responding in the first session of training, Doug’s correct responses decreased and remained at chance level until training was discontinued.

Figure 5
Auditory–Visual Conditional Discrimination (AVCD) Training for Hank and Doug

The results of AVCD training for Lance and Arthur are shown in Figure 6 and Table 1. Lance’s correct responses increased from zero in baseline to mastery level in two training sessions following the 0-s prompt delay (0-s prompt-delay data not shown in the figure). Arthur’s correct responses similarly reached mastery level following two sessions of training.

Overall, the results of AVCD training showed that only two of the seven participants (Lance and Arthur) demonstrated mastery of three AVCD targets; the remaining five did not, despite 15 or 30 (Ben only) sessions of training. The training results of Lance and Arthur were not accurately predicted by the outcomes of their skills assessments, because those participants did not engage in mastery level responding in the scanning condition of the skills assessment. In contrast, the training results of the other five participants (Ryan, Ben, Roger, Hank, and Doug) were accurately predicted by the outcomes of their skills assessments, which suggested that those five participants were missing one or more skills that may be necessary for successful AVCD training.

Figure 6
Auditory–Visual Conditional Discrimination (AVCD) Training for Lance and Arthur

Discussion

The results of the skills assessment accurately predicted the outcome of AVCD training for five of the seven participants. In particular, the skills assessment was most accurate in predicting which participants would not benefit from AVCD training. All five participants whose responding failed to meet the mastery criterion during AVCD training also had responding that failed to meet the mastery criterion in one or more of the conditions in the skills assessment. The two participants whose responding met the mastery criterion during AVCD training (Lance and Arthur) demonstrated mastery of all skills assessment conditions except scanning. If data on scanning are removed from the analysis, then the skills assessment accurately predicted the outcome of AVCD training for all seven participants.

The predictive utility of the skills assessment in the present study partially replicates the results of Kodak et al. (2015) in regard to the auditory-discrimination condition being a particularly accurate predictor of success with AVCD training. Five participants in Kodak et al. and two participants in the present investigation who demonstrated mastery in the auditory-discrimination condition also demonstrated mastery during AVCD training. Further, two participants in Kodak et al. and five participants in the current investigation who did not demonstrate mastery in the auditory-discrimination condition in the skills assessment also failed to demonstrate mastery during AVCD training. In sum, the results of the auditory-discrimination condition alone accurately predicted AVCD training outcomes for 14 of 17 participants across both studies.
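As a quick arithmetic check of the across-study tally (counts taken directly from the preceding sentences):

```python
# Counts from Kodak et al. (2015) and the current study, respectively:
# participants who mastered the auditory-discrimination condition and then
# mastered AVCD training, and participants who failed both.
mastered_both = 5 + 2
failed_both = 2 + 5
total_participants = 17

accurate_predictions = mastered_both + failed_both
print(f"{accurate_predictions} of {total_participants}")  # -> 14 of 17
```
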

The results of the auditory-discrimination condition are consistent with other measures of auditory discrimination (e.g., echoics, instruction following) that were conducted with participants prior to both studies and showed deficits in this repertoire. For example, five of the participants in the current study (Ryan, Ben, Roger, Hank, and Doug) and three of the participants in Kodak et al. (2015; Hal, Larry, and Freddy) did not engage in echoic behavior, and most showed limited success and faulty stimulus control during programs targeting auditory instruction following (e.g., “come here”). In contrast, three of the participants in the current study (Lance, Arthur, and Gina) and six participants in Kodak et al. (Rose, Linda, Brandan, Wyatt, Josh, and Amar) engaged in echoic behavior and followed one-step instructions. Auditory discrimination and echoic behavior, in particular, may be crucial for successful AVCD training due to the necessity of joint control. Lowenkron (1998) described joint control as “… a discrete event, a change in stimulus control that occurs when a response topography, evoked by one stimulus and preserved by rehearsal, is emitted under the additional control of a second stimulus” (p. 332). For example, a learner likely needs to echo the auditory sample stimulus (e.g., repeat “cat” covertly or overtly immediately after the instructor says, “cat”) and engage in self-echoic behavior (e.g., continue repeating “cat” overtly or covertly). The self-echoic behavior preserves the auditory sample stimulus through rehearsal while the learner simultaneously scans the array of visual stimuli. When the learner sees the picture of the cat in the array and (overtly or covertly) tacts the picture as “cat,” the auditory product of the tact matches the auditory product of the self-echoic behavior and occasions a pointing response to the picture of the cat (Miguel, 2016; Striefel et al., 1976).
If a learner cannot yet echo auditory stimuli (as was the case for many participants in the current study and several in Kodak et al., 2015), then the transient nature of the auditory stimulus without rehearsal may decrease accuracy during AVCD training. Although it may be possible to replace the auditory-discrimination condition in the skills assessment with an assessment of echoic behavior (e.g., EESA; Esch, 2008), a non-vocal assessment of auditory discrimination may be a more inclusive measure for all individuals with potential impairments in skills related to AVCD.

The current study modified the procedures in the auditory-discrimination condition from those used by Kodak et al. (2015) to reduce the likelihood of biased responding. Some participants in Kodak et al. engaged in biased responding during a go/no-go procedure by consistently touching the blank white card on the table during all trials rather than only touching the card in the presence of the relevant auditory stimulus. Other researchers also have reported biased responding for individuals with developmental disabilities exposed to the go/no-go procedure (e.g., Bergmann et al., 2021; Serna, 2016; Serna et al., 2009). The current modification arranged an auditory match-to-sample procedure in which participants matched identical auditory stimuli. This procedure was based on Speckman-Collins et al. (2007) and Bergmann et al. (2021), who taught auditory discriminations using auditory match-to-sample procedures to children with ASD. In addition, this procedure is akin to an auditory match-to-sample task included in the Headsprout® reading program, which has been used successfully with learners with ASD (e.g., Grindle et al., 2013; Plavnick et al., 2016). Nevertheless, the complexity of the auditory match-to-sample discrimination may prevent assessment of an auditory-discrimination repertoire for learners who could respond appropriately to less complex auditory-discrimination procedures (Serna, 2016). Researchers seeking to evaluate alternative auditory-discrimination procedures (e.g., EESA; Esch, 2008; do this/do that; Bergmann et al., 2021) to include in the skills assessment could compare levels of correct responses across procedures to determine if one method may be ideal for most learners. However, we recommend that any comparison of auditory-discrimination procedures for this skills assessment include an evaluation of performance during AVCD training to examine the correlation of assessment and treatment outcomes.

Unlike most of the participants in Kodak et al. (2015), six of the current participants had already received extended exposure to unsuccessful AVCD training at school and during their behavioral intervention prior to completion of the skills assessment. Extended exposure to a variety of instructional strategies that fail to establish AVCD may be common for a proportion of children with ASD, due to the ubiquity of this type of training in educational settings (Grow & LeBlanc, 2013; Kodak et al., 2018). Learners who have educational goals related to AVCD but do not have relevant repertoires upon which to establish these skills may develop error patterns during training such as response or position biases (Grow et al., 2011; Saunders & Spradlin, 1989, 1990, 1993) that can complicate or hinder subsequent instruction. Participants in the present study displayed some of these error patterns during the skills assessment and AVCD training. For example, the five participants whose responding did not reach mastery during AVCD training showed a position bias in baseline and treatment during AVCD training. Ryan, Ben, Hank, Doug, and Gina also responded to a specific position in the array more often during the auditory-discrimination condition of the skills assessment, and Ryan had a similar bias during a portion of sessions in the visual-discrimination condition.

Consistent error patterns may occur during instruction because such patterns maximize reinforcement for chance-level responding when the skill has not yet been acquired (Cumming & Berryman, 1961; Kangas & Branch, 2008; Mackay, 1991). For example, responding to the left position on every trial in an array of three produces reinforcement during approximately 33% of trials, whereas random responding across positions may result in more variable (and sometimes leaner) schedules of reinforcement. It is possible that persistent error patterns suggest that other, related skills (such as those included in the skills assessment) could be taught or strengthened rather than continuing instruction on the current skill (e.g., AVCD). Additional research is needed to evaluate whether certain response patterns suggest deficits in a skill area that, once resolved, might lead to successful learning.
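The approximately 33% reinforcement rate for a fixed position bias can be illustrated with a brief simulation (a sketch assuming the correct comparison is equally likely to appear in each of the three positions):

```python
import random

def bias_hit_rate(n_trials=10_000, seed=0):
    """Estimate the reinforcement rate for a fixed left-position bias when
    the correct comparison is placed uniformly among three positions."""
    rng = random.Random(seed)
    # The biased learner always selects position 0; a trial is reinforced
    # only when the correct comparison happens to be in that position.
    hits = sum(rng.randrange(3) == 0 for _ in range(n_trials))
    return hits / n_trials

rate = bias_hit_rate()
print(round(rate, 2))  # close to 1/3
```
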

There are several limitations of the current study that warrant consideration. The experimenters discontinued AVCD training following 15 or 30 treatment sessions (plus two 0-s prompt-delay sessions that were included in neither the figures nor the discontinuation criterion). It is possible that responding may have eventually met the mastery criterion following extended training. However, results of Kodak et al. (2015) show that participants’ responding reached the mastery criterion in 2 to 11 training sessions, and participants who required 12 or more sessions of instruction (maximum of 18 sessions) showed neither increasing levels of correct responses nor responding that reached the mastery criterion. In addition, other researchers have used a similar discontinuation criterion for conditional discrimination training with a constant prompt delay (e.g., Kisamore et al., 2016). Our participants were exposed to 60 trials of instruction per stimulus with no increasing trend in correct responses before training was discontinued, whereas Kisamore et al. (2016) determined that 50 trials of verbal conditional discrimination training per stimulus with no increasing trend indicated the training procedure was ineffective and necessitated remedial training.

We could have made modifications to AVCD training (e.g., blocking; Saunders & Spradlin, 1989) that may have resulted in mastery during AVCD training. However, we replicated an empirically validated training procedure included in Kodak et al. (2015) rather than making additional modifications. Further, we determined that lengthy exposure to training was not clinically indicated for most of the participants because they had already been exposed to several training procedures (e.g., prompt delays with error correction, identity-matching prompts, extensive simple discrimination training, blocking) during unsuccessful instruction on AVCD prior to participation. Moreover, our clinical goal for most of these participants was to identify skills that were not mastered in the skills assessment and that would become the focus of subsequent instruction. Future evaluations of the skills assessment could arrange a sequence of instructional strategies during AVCD training (e.g., Grow et al., 2011) to attempt to produce mastery of targeted conditional discriminations.

Similar to Kodak et al. (2015), the current findings suggest that the definition of scanning may be too stringent and could be revised or excluded. Several of our participants engaged in high levels of correct matching that stood in contrast to the low levels of scanning measured within the same trials. Rather than engaging in an uninterrupted eye-gaze shift from one stimulus to the next in the array (i.e., our definition of scanning), participants could have looked at each stimulus after brief instances of looking elsewhere (e.g., Lance and Bryce). Observers did not score an instance of scanning if participants shifted their gaze from a comparison stimulus back to the sample stimulus. Although interruptions in scanning may increase the length of time necessary to attend to the array, an efficient scanning response does not appear to be necessary for successful visual identity matching. Researchers who seek to evaluate components of the skills assessment could modify the definition of scanning to allow for interruptions in eye-gaze shift. One may also consider whether it is necessary to scan all stimuli in the array if the participant’s eye gaze is directed toward the correct stimulus. Anecdotally, the experimenter observed that Arthur frequently scanned the array until he reached the correct comparison stimulus, and then he stopped scanning. If the correct comparison was in the left or middle position in the array, correct scanning was not scored because Arthur did not scan the entire array of comparisons. Nevertheless, we do not recommend omitting a measure of scanning or eye gaze altogether, as persistent patterns of responding without looking at stimuli would likely invalidate the results of the skills assessment.

Finally, the necessity of the skills assessment versus measurement of these skills via other assessments remains unknown. For example, identity matching, imitation, visual discrimination, and some types of auditory discrimination (echoics, instruction following) are measured in the VB-MAPP. Researchers could examine whether measures of these skills in the VB-MAPP predict success during AVCD training as accurately as, or more accurately than, the current skills assessment. If specific skills measured in the VB-MAPP could be used to accurately predict whether and when to initiate AVCD training with learners, practitioners would benefit from instruction on the use of the VB-MAPP results for this purpose.

Currently, the skills assessment can only be used as a predictive tool to identify learners who may not benefit from AVCD training. However, an important next step for the assessment is to determine whether there is a functional relationship between the skills measured in the assessment and acquisition of AVCDs. To do so, researchers will need to (1) conduct the skills assessment and identify learners who do not show mastery of one or more skills, (2) conduct AVCD training with a set of stimuli that does not lead to mastery level responding, (3) teach the missing skills from the assessment (this step may include substeps), and (4) repeat AVCD training with the same set of stimuli to determine whether mastery of the previously missing skills subsequently leads to acquisition of AVCD. Due to the length of time required to complete the proposed steps, it will be critical to include a control group of participants who complete the initial skills assessment and fail to acquire AVCDs during training and then repeat AVCD training after a similar length of time but who do not receive instruction on missing skills. Arranging the proposed comparison will evaluate the necessity and sufficiency of the skills measured in the skills assessment for successful AVCD training. Although it is unethical to restrict learning opportunities for clients to conduct the proposed comparison, children with ASD who do not receive intervention services that target the skills included in the assessment can serve as control participants for an initial evaluation of this question.

Despite the prevalence of AVCDs in the environment and in educational opportunities, some children with ASD require behavioral intervention to acquire AVCDs. Nevertheless, a proportion of children with ASD who are exposed to empirically validated interventions may fail to acquire this critical repertoire. Rather than exposing them to extended instruction with multiple procedures that do not produce the intended outcomes, behavior analysts must identify and consider skills that relate to targeted intervention goals. The development of relevant skills assessments for repertoires that may be difficult to establish with certain learners can enhance the success of practitioners and advance the science of behavior analysis. Continued research and refinement of the skills assessment in the present investigation will assist in identifying the relevant skills that are necessary and sufficient for success during AVCD training.

References

Bergmann, S., Kodak, T., VanDenElzen, G., Cliett, T., & Benitez, B. (2021). Efficacy and efficiency of auditory discrimination procedures for children with autism spectrum disorder and typical development: A preliminary investigation. European Journal of Behavior Analysis, 22(1), 74–100. https://doi.org/10.1080/15021149.2020.1795556

Carr, J. E., Nicolson, A. C., & Higbee, T. S. (2000). Evaluation of a brief multiple-stimulus preference assessment in a naturalistic context. Journal of Applied Behavior Analysis, 33(3), 353–357. https://doi.org/10.1901/jaba.2000.33-353

Cumming, W. W., & Berryman, R. (1961). Some data on matching behavior in the pigeon. Journal of the Experimental Analysis of Behavior, 4(3), 281–284. https://doi.org/10.1901/jeab.1961.4-281

Esch, B. E. (2008). Early echoic skills assessment. AVB Press.

Gast, D. L. (2010). Single subject research methodology in behavioral sciences. Routledge.

Green, G. (2001). Behavior analytic instruction for learners with autism: Advances in stimulus control technology. Focus on Autism and Other Developmental Disabilities, 16(2), 72–85. https://doi.org/10.1177/108835760101600203

Grindle, C. F., Hughes, J. C., Saville, M., Huxley, K., & Hastings, R. P. (2013). Teaching early reading skills to children with autism using MimioSprout Early Reading. Behavioral Interventions, 28(3), 203–224. https://doi.org/10.1002/bin.1364

Grow, L. L., Carr, J. E., Kodak, T. M., Jostad, C. M., & Kisamore, A. N. (2011). A comparison of methods for teaching receptive labeling to children with autism spectrum disorders. Journal of Applied Behavior Analysis, 44(3), 475–498. https://doi.org/10.1901/jaba.2011.44-475

Grow, L., & LeBlanc, L. (2013). Teaching receptive language skills: Recommendations for instructors. Behavior Analysis in Practice, 6(1), 56–75. https://doi.org/10.1007/bf03391791

Halbur, M., Kodak, T., Williams, X., Reidy, J., & Halbur, C. (2021). Comparisons of sounds and words as sample stimuli for discrimination training. Journal of Applied Behavior Analysis, 54(3), 1126–1138. https://doi.org/10.1002/jaba.830

Kangas, B. D., & Branch, M. N. (2008). Empirical validation of a procedure to correct position and stimulus biases in matching-to-sample. Journal of the Experimental Analysis of Behavior, 90(1), 103–112. https://doi.org/10.1901/jeab.2008.90-103

Kerr, N., Meyerson, L., Flora, J., Tharinger, D., Schallert, D., Casey, L., & Fehr, M. J. (1977). The measurement of motor, visual and auditory discrimination skills in mentally retarded children and adults and in young normal children. Rehabilitation Psychology, 24(3), 91–206. https://doi.org/10.1037/h0090912

Kisamore, A., Karsten, A., & Mann, C. (2016). Teaching multiply controlled intraverbals to children and adolescents with autism spectrum disorders. Journal of Applied Behavior Analysis, 49(4), 826–847. https://doi.org/10.1002/jaba.344

Kodak, T., Cariveau, T., LeBlanc, B. A., Mahon, J. J., & Carroll, R. A. (2018). Selection and implementation of skill acquisition programs by special education teachers and staff for students with autism spectrum disorder. Behavior Modification, 42(1), 58–83. https://doi.org/10.1177/0145445517692081

Kodak, T., Clements, A., Paden, A., LeBlanc, B., Mintz, J., & Toussaint, K. (2015). Examining the relation between an assessment of skills and performance on auditory-visual conditional discriminations for children with autism. Journal of Applied Behavior Analysis, 48(1), 52–70. https://doi.org/10.1002/jaba.160

LaMarca, V., & LaMarca, J. (2018). Designing receptive language programs: Pushing the boundaries of research and practice. Behavior Analysis in Practice, 11(4), 479–495. https://doi.org/10.1007/s40617-018-0208-1

Leaf, R., & McEachin, J. A. (1999). A work in progress: Behavior management strategies and a curriculum for intensive behavioral treatment of autism. DRL Books.

Lovaas, O. I. (2003). Teaching individuals with developmental delays: Basic intervention techniques. PRO-ED.

Lowenkron, B. (1998). Some logical functions of joint control. Journal of the Experimental Analysis of Behavior, 69(3), 327–354. https://doi.org/10.1901/jeab.1998.69-327

Mackay, H. A. (1991). Conditional stimulus control. In I. H. Iversen & K. A. Lattal (Eds.), Techniques in the behavioral and neural sciences: Vol. 6. Experimental analysis of behavior (Part 1, pp. 301–350). Elsevier.

Maurice, C., Green, G., & Foxx, R. M. (2001). Making a difference: Behavioral intervention for autism. PRO-ED.

Miguel, C. F. (2016). Common and intraverbal bidirectional naming. The Analysis of Verbal Behavior, 32(2), 125–138. https://doi.org/10.1007/s40616-016-0066-2

Plavnick, J. B., Thompson, J. L., Englert, C. S., Mariage, T., & Johnson, K. (2016). Mediating access to Headsprout® Early Reading for children with autism spectrum disorders. Journal of Behavioral Education, 25(3), 357–378. https://doi.org/10.1007/s10864-015-9244-x

Saunders, K. J., & Spradlin, J. E. (1989). Conditional discrimination in mentally retarded adults: The effect of training the component simple discriminations. Journal of the Experimental Analysis of Behavior, 52(1), 1–12. https://doi.org/10.1901/jeab.1989.52-1

Saunders, K. J., & Spradlin, J. E. (1990). Conditional discrimination in mentally retarded adults: The development of generalized skills. Journal of the Experimental Analysis of Behavior, 54(3), 239–250. https://doi.org/10.1901/jeab.1990.54-239

Saunders, K. J., & Spradlin, J. E. (1993). Conditional discrimination in mentally retarded adults: Programming acquisition and learning set. Journal of the Experimental Analysis of Behavior, 60(3), 571–585. https://doi.org/10.1901/jeab.1993.60-571

Serna, R. W. (2016). Recent innovations in the assessment of auditory discrimination abilities in nonspeaking individuals with intellectual disabilities. In M. Romski & R. Sevcik (Eds.), Communication interventions for individuals with severe disabilities: Exploring research challenges & opportunities (pp. 235–258). Paul H. Brookes.

Serna, R. W., Preston, M. A., & Thompson, G. B. (2009). Assessing nonverbal same/different judgments of auditory stimuli in individuals with intellectual disabilities: A methodological investigation. Revista Brasileira de Analise do Comportamento, 5(2), 69–87. https://www.ncbi.nlm.nih.gov/pubmed/23585816

Speckman-Collins, J., Lee Park, H., & Greer, R. D. (2007). Generalized selection-based auditory matching and the emergence of the listener component of naming. Journal of Intensive and Early Behavior Intervention, 4(2), 412–429. https://doi.org/10.1037/h0100382

Striefel, S., Wetherby, B., & Karlan, G. R. (1976). Establishing generalized verb-noun instruction-following skills in retarded children. Journal of Experimental Child Psychology, 22(2), 247–260. https://doi.org/10.1016/0022-0965(76)90005-9

Sundberg, M. L. (2008) Verbal behavior milestones assessment and placement program: The VB-MAPP. AVB Press.

Received July 5, 2021
Final acceptance January 7, 2022
Action Editor, Regina Carroll

Supporting information
Additional Supporting Information may be found in the online version of this article at the publisher’s website.