Pre-experimental Design: Definition, Types & Examples
- October 1, 2021
Experimental research is conducted to analyze and understand the effect of a program or treatment. There are three types of experimental research designs: pre-experimental designs, true experimental designs, and quasi-experimental designs.
In this blog, we will be talking about pre-experimental designs. Let's first define pre-experimental research.
What is Pre-experimental Research?
As the name suggests, pre-experimental research happens before the true experiment starts. It is conducted to gauge the effect of the researchers' intervention on a group of people, which helps them decide whether the cost and time of conducting a true experiment are worthwhile. Hence, pre-experimental research is a preliminary step to justify the presence of the researcher's intervention.
The pre-experimental approach gives an early indication of whether the experiment can become a successful full-scale study.
What is Pre-experimental Design?
The pre-experimental design observes one or more experimental groups after a certain treatment has been applied. It is the simplest form of research design and follows the basic steps of an experiment.
The pre-experimental design does not have a comparison group. This means that while a researcher can claim that participants who received a certain treatment experienced a change, they cannot conclude that the change was caused by the treatment itself.
The design can still be useful for exploratory research that tests the feasibility of a further study.
Let us look at how the pre-experimental design differs from true and quasi-experiments: a true experiment randomly assigns participants to a treatment group and a control group; a quasi-experiment uses a comparison group but no random assignment; a pre-experiment typically uses neither. In short, a pre-experimental design tests whether a treatment has the potential to cause a change at all. For the same reason, it is advisable to run a pre-experiment to gauge the potential of a true experiment.
Types of Pre-experimental Designs
Now that you have a better understanding of what pre-experimental design is, let's look at its types and how they work:
One-shot case study design
- This design applies the treatment to a single group.
- It takes only a single measurement, after the treatment.
- A one-shot case study design analyzes post-test results only.
The one-shot case study compares the post-test results with the expected results: what the outcome is, versus how the case would presumably have looked had the treatment not been applied.
For example, a team leader wants to implement a new soft skills program in the firm. The employees can be measured at the end of the first month to see the improvement in their soft skills, giving the team leader a sense of the program's impact.
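The logic of the one-shot case study can be sketched in a few lines of Python. The scores below are made up for illustration, and the "expected" baseline is an assumption rather than a measured control group, which is precisely the design's weakness.

```python
from statistics import mean

# Hypothetical post-test soft-skills scores (out of 100) for the single
# treated group, measured one month after the program.
post_test = [72, 68, 75, 80, 66, 71]

# Assumed score had the training not happened. This is a judgment call,
# not a measured control group -- the one-shot design's key limitation.
expected_without_training = 65

observed = mean(post_test)
improvement = observed - expected_without_training
print(f"Observed mean: {observed:.1f} vs. assumed baseline: {expected_without_training}")
print(f"Apparent improvement: {improvement:.1f} points")
```

Everything hinges on how defensible `expected_without_training` is; the code cannot tell you whether the 7-point gap is due to the program or to something else entirely.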
One-group pretest-posttest design
- Like the previous one, this design also works on just one experimental group.
- However, this one takes two measurements into account.
- A pre-test and a post-test are conducted.
As the name suggests, this design uses one group and conducts both a pre-test and a post-test on it. The pre-test shows the group's state before the treatment, while the post-test captures the changes after the treatment.
This may sound like a true experiment, but being a pre-experimental design, it does not have a control group.
Following the previous example, the team leader here will conduct two tests: one before the soft skills program, to gauge the employees' level prior to the training, and a post-test to assess their status afterward.
Now that he has a frame of reference, he has a much clearer picture of how the program helped the employees.
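In code, the one-group pretest-posttest comparison reduces to per-participant change scores. The numbers below are invented for illustration; note that nothing in the computation attributes the change to the training, since there is no control group.

```python
from statistics import mean

# Hypothetical pre- and post-training scores for the same six employees
# (same employee order in both lists).
pre_test = [55, 60, 48, 62, 50, 57]
post_test = [68, 71, 60, 70, 63, 66]

# Per-employee change from pretest to posttest.
changes = [post - pre for pre, post in zip(pre_test, post_test)]
print("Individual changes:", changes)
print(f"Mean change: {mean(changes):.1f} points")
```

A real analysis would typically follow this with a paired significance test, but even then the design cannot rule out maturation or history effects.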
Static-group comparison
- This design compares two groups.
- One group is exposed to the treatment.
- The other group is not exposed to the treatment.
- The difference between the two groups is the result of the experiment.
As the name suggests, this design has two groups, which means it involves a control group.
In a static-group comparison design, one group goes through the treatment while the other does not, and the two are then compared to determine the outcome of the treatment.
The team lead assigns one group of employees to the soft skills training while the other group serves as a control and is not exposed to any program. He then compares the two groups and finds that the treatment group has improved its soft skills more than the control group.
Because of this structure, the static-group comparison design is often classified as a quasi-experimental design as well.
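A static-group comparison boils down to a difference between group means, as in this sketch with hypothetical scores. Because the groups were not randomly assigned, the difference could also reflect pre-existing differences between them.

```python
from statistics import mean

# Hypothetical post-test scores: one group received the soft-skills
# training, the other group (not randomly assigned) did not.
treatment_group = [70, 74, 68, 77, 72]
control_group = [61, 65, 59, 66, 63]

difference = mean(treatment_group) - mean(control_group)
print(f"Treatment mean: {mean(treatment_group):.1f}")
print(f"Control mean: {mean(control_group):.1f}")
print(f"Group difference: {difference:.1f} points")
```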
Characteristics of Pre-experimental Designs
In this section, let us list the characteristics of pre-experimental design:
- It generally uses only one treatment group, which makes observation simple and easy.
- It validates the experiment in the preliminary phase itself.
- It tells researchers how their intervention will affect the whole study.
- As they are conducted at the beginning, pre-experimental designs give early evidence for or against the intervention.
- It does not involve randomization of the participants.
- It generally does not involve a control group, but where a control group must be studied against a treatment group, the static-group comparison design comes into the picture.
- It gives an idea of how the treatment is likely to work in an actual true experiment.
Validity of results in Pre-experimental Designs
Validity is the degree to which data or results reflect reality, and in pre-experimental research it is hard to establish. Testing a hypothesis or resolving a problem with such a design is very difficult, close to impossible, so researchers find it challenging to generalize results obtained from a pre-experimental design to the actual experiment.
Since a pre-experimental design generally has no comparison group to measure its results against, researchers have good reason to be cautious about those results. Without a comparison, it is hard to tell how significant or valid an observed result is: the change could stem from unintended variations in the treatment, maturation of the group, or sheer chance.
Even if all the above factors work in favor of your experiment, and you even have a control group to compare with, one problem remains: the "kind" of groups you get for the true experiment. The subjects in your pre-experimental design may differ considerably from the subjects in your true experiment. If so, your results will change even when the treatment is held constant.
Advantages of Pre-experimental Designs
- Cost-effective due to its simple process.
- Very easy to conduct.
- Can be conducted efficiently in a natural environment.
- Suitable for beginners.
- Involves little human intervention.
- Indicates how the treatment is likely to perform in a true experiment.
Disadvantages of Pre-experimental Designs
- It is a weak design for determining causal relationships between variables.
- It offers little control over the research.
- It poses a high threat to internal validity.
- Researchers find it tough to verify the integrity of the results.
- The absence of a control group makes the results less reliable.
This sums up the basics of pre-experimental design and how it differs from other experimental research designs. Curious to learn how you can use survey software to conduct your experimental research? Book a meeting with us.
Pre-experimental design is a research method that happens before the true experiment and determines how the researcher’s intervention will affect the experiment.
An example of a pre-experimental design would be a gym trainer implementing a new training schedule for a trainee.
Characteristics of pre-experimental design include its ability to determine the significance of treatment even before the true experiment is performed.
Researchers want to know how their intervention is going to affect the experiment. So even before the true experiment starts, they carry out a pre-experimental research design to determine the possible results of the true experiment.
The pre-experimental design deals with the treatment's effect and is carried out before the true experiment takes place. A true experiment is the actual experiment; it is often worthwhile to conduct a pre-experiment first to see how the intervention is likely to affect the experiment.
The true experimental design conducts the pre-test and post-test on both the treatment group and a control group, whereas in a pre-experimental design the control group and the pre-test are optional. A pre-experiment does not always include those two elements, and it helps the researcher determine how the real experiment is likely to unfold.
The main difference between a pre-experimental design and a quasi-experimental design is that a pre-experimental design does not use control groups while a quasi-experimental design does. A quasi-experiment typically uses the pre-test/post-test model of result comparison, while a pre-experimental design mostly does not.
Non-experimental research methods fall into three major categories: cross-sectional research, correlational research, and observational research.
Child Care and Early Education Research Connections
Pre-experimental Designs
Pre-experiments are the simplest form of research design. In a pre-experiment either a single group or multiple groups are observed subsequent to some agent or treatment presumed to cause change.
Types of Pre-Experimental Design
One-shot case study design
A single group is studied at a single point in time after some treatment that is presumed to have caused change. The carefully studied single instance is compared to general expectations of what the case would have looked like had the treatment not occurred and to other events casually observed. No control or comparison group is employed.
One-group pretest-posttest design
A single case is observed at two time points, one before the treatment and one after the treatment. Changes in the outcome of interest are presumed to be the result of the intervention or treatment. No control or comparison group is employed.
Static-group comparison
A group that has experienced some treatment is compared with one that has not. Observed differences between the two groups are assumed to be a result of the treatment.
Validity of Results
An important drawback of pre-experimental designs is that they are subject to numerous threats to their validity. Consequently, it is often difficult or impossible to dismiss rival hypotheses or explanations. Therefore, researchers must exercise extreme caution in interpreting and generalizing the results from pre-experimental studies.
One reason that it is often difficult to assess the validity of studies that employ a pre-experimental design is that they often do not include any control or comparison group. Without something to compare it to, it is difficult to assess the significance of an observed change in the case. The change could be the result of historical changes unrelated to the treatment, the maturation of the subject, or an artifact of the testing.
Even when pre-experimental designs identify a comparison group, it is still difficult to dismiss rival hypotheses for the observed change. This is because there is no formal way to determine whether the two groups would have been the same if it had not been for the treatment. If the treatment group and the comparison group differ after the treatment, this might be a reflection of differences in the initial recruitment to the groups or differential mortality in the experiment.
Advantages and Disadvantages
As exploratory approaches, pre-experiments can be a cost-effective way to discern whether a potential explanation is worthy of further investigation.
Disadvantages
Pre-experiments offer few advantages since it is often difficult or impossible to rule out alternative explanations. The nearly insurmountable threats to their validity are clearly the most important disadvantage of pre-experimental research designs.
Pretest-Posttest Designs
For many true experimental designs, pretest-posttest designs are the preferred method to compare participant groups and measure the degree of change occurring as a result of treatments or interventions.
Pretest-posttest designs grew from the simpler posttest only designs, and address some of the issues arising with assignment bias and the allocation of participants to groups.
One example is education, where researchers want to monitor the effect of a new teaching method on groups of children. Other areas include evaluating the effects of counseling, testing medical treatments, and measuring psychological constructs. The only stipulation is that, in a true experimental design, the subjects must be randomly assigned to groups in order to properly isolate and nullify any nuisance or confounding variables.
The Posttest Only Design With Non-Equivalent Control Groups
Pretest-posttest designs are an expansion of the posttest only design with nonequivalent groups, one of the simplest methods of testing the effectiveness of an intervention.
In this design, which uses two groups, one group is given the treatment and the results are gathered at the end. The control group receives no treatment, over the same period of time, but undergoes exactly the same tests.
Statistical analysis can then determine if the intervention had a significant effect. One common example of this is in medicine; one group is given a medicine, whereas the control group is given none, and this allows the researchers to determine if the drug really works. This type of design, whilst commonly using two groups, can be slightly more complex. For example, if different dosages of a medicine are tested, the design can be based around multiple groups.
Whilst this posttest only design does find many uses, it is limited in scope and contains many threats to validity. It is very poor at guarding against assignment bias, because the researcher knows nothing about the individual differences within the control group and how they may have affected the outcome. Even with randomization of the initial groups, this failure to address assignment bias means that the statistical power is weak.
The results of such a study will always be limited in scope and, resources permitting, most researchers use a more robust design, of which pretest-posttest designs are one. The posttest only design with non-equivalent groups is usually reserved for experiments performed after the fact, such as a medical researcher wishing to observe the effect of a medicine that has already been administered.
The Two Group Control Group Design
This is, by far, the simplest and most common of the pretest-posttest designs, and is a useful way of ensuring that an experiment has a strong level of internal validity. The principle behind this design is relatively simple, and involves randomly assigning subjects between two groups, a test group and a control. Both groups are pre-tested, and both are post-tested, the ultimate difference being that one group was administered the treatment.
This test allows a number of distinct analyses, giving researchers the tools to filter out experimental noise and confounding variables. The internal validity of this design is strong, because the pretest ensures that the groups are equivalent. The various analyses that can be performed upon a two-group control group pretest-posttest design are (Fig 1):
- This design allows researchers to compare the final posttest results between the two groups, giving them an idea of the overall effectiveness of the intervention or treatment. (C)
- The researcher can see how both groups changed from pretest to posttest, whether one, both or neither improved over time. If the control group also showed a significant improvement, then the researcher must attempt to uncover the reasons behind this. (A and A1)
- The researchers can compare the scores in the two pretest groups, to ensure that the randomization process was effective. (B)
These checks evaluate the efficiency of the randomization process and also determine whether the group given the treatment showed a significant difference.
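The three comparisons labeled (A and A1), (B), and (C) above can be sketched with hypothetical scores. In practice each comparison would be backed by a significance test; here we only compute the raw mean differences.

```python
from statistics import mean

# Hypothetical pretest/posttest scores for two randomly assigned groups.
treat_pre, treat_post = [50, 52, 48, 51], [64, 66, 60, 65]
ctrl_pre, ctrl_post = [49, 51, 50, 52], [52, 53, 50, 54]

# (B) Pretest comparison: checks that randomization produced similar groups.
pretest_gap = mean(treat_pre) - mean(ctrl_pre)

# (A and A1) Within-group change from pretest to posttest.
treat_change = mean(treat_post) - mean(treat_pre)
ctrl_change = mean(ctrl_post) - mean(ctrl_pre)

# (C) Posttest comparison: the apparent overall effect of the treatment.
posttest_gap = mean(treat_post) - mean(ctrl_post)

print(f"(B) pretest gap: {pretest_gap:+.2f}")
print(f"(A) treatment change: {treat_change:+.2f}  (A1) control change: {ctrl_change:+.2f}")
print(f"(C) posttest gap: {posttest_gap:+.2f}")
```

A near-zero pretest gap supports the randomization, while a large control-group change (A1) would signal that something other than the treatment is at work.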
Problems With Pretest-Posttest Designs
The main problem with this design is that it improves internal validity but sacrifices external validity to do so. There is no way of judging whether the process of pre-testing actually influenced the results because there is no baseline measurement against groups that remained completely untreated. For example, children given an educational pretest may be inspired to try a little harder in their lessons, and both groups would outperform children not given a pretest, so it becomes difficult to generalize the results to encompass all children.
The other major problem, which afflicts many sociological and educational research programs, is that it is impossible and unethical to isolate all of the participants completely. If two groups of children attend the same school, it is reasonable to assume that they mix outside of lessons and share ideas, potentially contaminating the results. On the other hand, if the children are drawn from different schools to prevent this, the chance of selection bias arises, because randomization is not possible.
The two-group control group design is an exceptionally useful research method, as long as its limitations are fully understood. For extensive and particularly important research, many researchers use the Solomon four group method, a design that is more costly, but avoids many weaknesses of the simple pretest-posttest designs.
Martyn Shuttleworth (Nov 3, 2009). Pretest-Posttest Designs. Retrieved Sep 26, 2024 from Explorable.com: https://explorable.com/pretest-posttest-designs
8.1 Experimental design: What is it and when should it be used?
Learning Objectives
- Define experiment
- Identify the core features of true experimental designs
- Describe the difference between an experimental group and a control group
- Identify and describe the various types of true experimental designs
Experiments are an excellent data collection strategy for social workers wishing to observe the effects of a clinical intervention or social welfare program. Understanding what experiments are and how they are conducted is useful for all social scientists, whether they actually plan to use this methodology or simply aim to understand findings from experimental studies. An experiment is a method of data collection designed to test hypotheses under controlled conditions. In social scientific research, the term experiment has a precise meaning and should not be used to describe all research methodologies.
Experiments have a long and important history in social science. Behaviorists such as John Watson, B. F. Skinner, Ivan Pavlov, and Albert Bandura used experimental design to demonstrate the various types of conditioning. Using strictly controlled environments, behaviorists were able to isolate a single stimulus as the cause of measurable differences in behavior or physiological responses. The foundations of social learning theory and behavior modification are found in experimental research projects. Moreover, behaviorist experiments brought psychology and social science away from the abstract world of Freudian analysis and towards empirical inquiry, grounded in real-world observations and objectively-defined variables. Experiments are used at all levels of social work inquiry, including agency-based experiments that test therapeutic interventions and policy experiments that test new programs.
Several kinds of experimental designs exist. In general, designs considered to be true experiments contain three basic key features:
- random assignment of participants into experimental and control groups
- a “treatment” (or intervention) provided to the experimental group
- measurement of the effects of the treatment in a post-test administered to both groups
Some true experiments are more complex. Their designs can also include a pre-test and can have more than two groups, but these are the minimum requirements for a design to be a true experiment.
Experimental and control groups
In a true experiment, the effect of an intervention is tested by comparing two groups: one that is exposed to the intervention (the experimental group, also known as the treatment group) and another that does not receive the intervention (the control group). Importantly, participants in a true experiment need to be randomly assigned to either the control or experimental groups. Random assignment uses a random number generator or some other random process to assign people into experimental and control groups. Random assignment is important in experimental research because it helps to ensure that the experimental group and control group are comparable and that any differences between the experimental and control groups are due to random chance. We will address more of the logic behind random assignment in the next section.
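Random assignment can be sketched with Python's standard library. The participant labels and the fixed seed are illustrative only; a real study would not fix the seed, but doing so makes this sketch reproducible.

```python
import random

# Hypothetical recruited sample of 20 people.
participants = [f"participant_{i}" for i in range(20)]

# Shuffle, then split the sample in half: the first half becomes the
# experimental group, the second half the control group.
rng = random.Random(42)  # fixed seed only so this sketch is reproducible
rng.shuffle(participants)

midpoint = len(participants) // 2
experimental_group = participants[:midpoint]
control_group = participants[midpoint:]

print(f"{len(experimental_group)} assigned to experimental, "
      f"{len(control_group)} to control")
```

Because every ordering of the shuffled list is equally likely, each participant has the same chance of landing in either group, which is the property that makes the groups comparable in expectation.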
Treatment or intervention
In an experiment, the independent variable is receiving the intervention being tested—for example, a therapeutic technique, prevention program, or access to some service or support. It is less common in social work research, but social science research may also use a stimulus, rather than an intervention, as the independent variable. For example, an electric shock or a reading about death might be used as a stimulus to provoke a response.
In some cases, it may be immoral to withhold treatment completely from a control group within an experiment. If you recruited two groups of people with severe addiction and only provided treatment to one group, the other group would likely suffer. For these cases, researchers use a control group that receives “treatment as usual.” Experimenters must clearly define what treatment as usual means. For example, a standard treatment in substance abuse recovery is attending Alcoholics Anonymous or Narcotics Anonymous meetings. A substance abuse researcher conducting an experiment may use twelve-step programs in their control group and use their experimental intervention in the experimental group. The results would show whether the experimental intervention worked better than normal treatment, which is useful information.
The dependent variable is usually the intended effect the researcher wants the intervention to have. If the researcher is testing a new therapy for individuals with binge eating disorder, the dependent variable may be the number of binge eating episodes a participant reports. The researcher likely expects the intervention to decrease the number of binge eating episodes reported by participants. Thus, the researcher must, at a minimum, measure the number of episodes that occur after the intervention, which is the post-test. In a classic experimental design, participants are also given a pretest to measure the dependent variable before the experimental treatment begins.
Types of experimental design
Let’s put these concepts in chronological order so we can better understand how an experiment runs from start to finish. Once you’ve collected your sample, you’ll need to randomly assign your participants to the experimental group and control group. In a common type of experimental design, you will then give both groups your pretest, which measures your dependent variable, to see what your participants are like before you start your intervention. Next, you will provide your intervention, or independent variable, to your experimental group, but not to your control group. Many interventions take a few weeks or months to complete, particularly therapeutic treatments. Finally, you will administer your post-test to both groups to observe any changes in your dependent variable. What we’ve just described is known as the classical experimental design and is the simplest type of true experimental design. All of the designs we review in this section are variations on this approach. Figure 8.1 visually represents these steps.
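The pretest/post-test logic of the classic design can be expressed as a simple calculation: the change observed in the experimental group minus the change observed in the control group. This is a hypothetical sketch; the function name and the sample scores are invented for illustration, and a real analysis would use statistical tests rather than raw means.

```python
def classic_experiment_effect(pre_exp, post_exp, pre_ctrl, post_ctrl):
    """Estimate an intervention's effect in a classic experimental design:
    the pre-to-post change in the experimental group minus the change in
    the control group (each list holds one score per participant)."""
    mean = lambda scores: sum(scores) / len(scores)
    change_exp = mean(post_exp) - mean(pre_exp)
    change_ctrl = mean(post_ctrl) - mean(pre_ctrl)
    return change_exp - change_ctrl

# Hypothetical symptom scores: the experimental group improves by 4 points
# on average, the control group by 1, so the estimated effect is -3.
effect = classic_experiment_effect([8, 10, 9], [5, 6, 4], [9, 10, 8], [8, 9, 7])
```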
An interesting example of experimental research can be found in Shannon K. McCoy and Brenda Major’s (2003) study of people’s perceptions of prejudice. In one portion of this multifaceted study, all participants were given a pretest to assess their levels of depression. No significant differences in depression were found between the experimental and control groups during the pretest. Participants in the experimental group were then asked to read an article suggesting that prejudice against their own racial group is severe and pervasive, while participants in the control group were asked to read an article suggesting that prejudice against a racial group other than their own is severe and pervasive. Clearly, these were not meant to be interventions or treatments to help depression, but were stimuli designed to elicit changes in people’s depression levels. Upon measuring depression scores during the post-test period, the researchers discovered that those who had received the experimental stimulus (the article citing prejudice against their same racial group) reported greater depression than those in the control group. This is just one of many examples of social scientific experimental research.
In addition to classic experimental design, there are two other ways of designing experiments that are considered to fall within the purview of “true” experiments (Babbie, 2010; Campbell & Stanley, 1963). The posttest-only control group design is almost the same as classic experimental design, except it does not use a pretest. Researchers who use posttest-only designs want to eliminate testing effects, in which participants’ scores on a measure change because they have already been exposed to it. If you took multiple SAT or ACT practice exams before you took the real one you sent to colleges, you’ve taken advantage of testing effects to get a better score. Considering the previous example on racism and depression, participants who are given a pretest about depression before being exposed to the stimulus would likely assume that the intervention is designed to address depression. That knowledge could cause them to answer differently on the post-test than they otherwise would. In theory, as long as the control and experimental groups have been determined randomly and are therefore comparable, no pretest is needed. However, most researchers prefer to use pretests in case randomization did not result in equivalent groups and to help assess change over time within both the experimental and control groups.
Researchers wishing to account for testing effects but also gather pretest data can use a Solomon four-group design. In the Solomon four-group design, the researcher uses four groups. Two groups are treated as they would be in a classic experiment—pretest, experimental group intervention, and post-test. The other two groups do not receive the pretest, though one receives the intervention. All groups are given the post-test. Table 8.1 illustrates the features of each of the four groups in the Solomon four-group design. By having one set of experimental and control groups that complete the pretest (Groups 1 and 2) and another set that does not complete the pretest (Groups 3 and 4), researchers using the Solomon four-group design can account for testing effects in their analysis.
Table 8.1 Solomon four-group design

| Group   | Pretest | Intervention | Posttest |
|---------|---------|--------------|----------|
| Group 1 | X       | X            | X        |
| Group 2 | X       |              | X        |
| Group 3 |         | X            | X        |
| Group 4 |         |              | X        |
Solomon four-group designs are challenging to implement in the real world because they are time- and resource-intensive. Researchers must recruit enough participants to create four groups and implement interventions in two of them.
Overall, true experimental designs are sometimes difficult to implement in a real-world practice environment. It may be impossible to withhold treatment from a control group or randomly assign participants in a study. In these cases, pre-experimental and quasi-experimental designs–which we will discuss in the next section–can be used. However, the differences in rigor from true experimental designs leave their conclusions more open to critique.
Experimental design in macro-level research
You can imagine that social work researchers may be limited in their ability to use random assignment when examining the effects of governmental policy on individuals. For example, it is unlikely that a researcher could randomly assign some states to decriminalize recreational marijuana and other states not to in order to assess the effects of the policy change. There are, however, important examples of policy experiments that use random assignment, including the Oregon Medicaid experiment. In the Oregon Medicaid experiment, the wait list for Medicaid in Oregon was so long that state officials conducted a lottery to determine who from the wait list would receive coverage (Baicker et al., 2013). Researchers used the lottery as a natural experiment that included random assignment. People selected to receive Medicaid were the experimental group, and those who remained on the wait list were the control group. There are practical complications with macro-level experiments, just as with other experiments. For example, the ethical concern with using people on a wait list as a control group exists in macro-level research just as it does in micro-level research.
Key Takeaways
- True experimental designs require random assignment.
- Control groups do not receive an intervention, and experimental groups receive an intervention.
- The basic components of a true experiment include a pretest, posttest, control group, and experimental group.
- Testing effects may cause researchers to use variations on the classic experimental design.
- Classic experimental design – uses random assignment, an experimental and control group, as well as pre- and posttesting
- Control group – the group in an experiment that does not receive the intervention
- Experiment – a method of data collection designed to test hypotheses under controlled conditions
- Experimental group – the group in an experiment that receives the intervention
- Posttest – a measurement taken after the intervention
- Posttest-only control group design – a type of experimental design that uses random assignment and an experimental and control group, but does not use a pretest
- Pretest – a measurement taken prior to the intervention
- Random assignment – using a random process to assign people into experimental and control groups
- Solomon four-group design – uses random assignment, two experimental and two control groups, pretests for half of the groups, and posttests for all
- Testing effects – when a participant’s scores on a measure change because they have already been exposed to it
- True experiments – a group of experimental designs that contain independent and dependent variables, pretesting and posttesting, and experimental and control groups
Image attributions
exam scientific experiment by mohamed_hassan CC-0
Foundations of Social Work Research Copyright © 2020 by Rebecca L. Mauldin is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.
8.2 Quasi-experimental and pre-experimental designs
Learning objectives.
- Distinguish true experimental designs from quasi-experimental and pre-experimental designs
- Identify and describe the various types of quasi-experimental and pre-experimental designs
As we discussed in the previous section, time, funding, and ethics may limit a researcher’s ability to conduct a true experiment. For researchers in the medical sciences and social work, conducting a true experiment could require denying needed treatment to clients, which is a clear ethical violation. Even those whose research may not involve the administration of needed medications or treatments may be limited in their ability to conduct a classic experiment. When true experiments are not possible, researchers often use quasi-experimental designs.
Quasi-experimental designs
Quasi-experimental designs are similar to true experiments, but they lack random assignment to experimental and control groups. Quasi-experimental designs have a comparison group that is similar to a control group, except that assignment to the comparison group is not determined by random assignment. The most basic of these quasi-experimental designs is the nonequivalent comparison group design (Rubin & Babbie, 2017). The nonequivalent comparison group design looks a lot like the classic experimental design, except it does not use random assignment. In many cases, these groups may already exist. For example, a researcher might conduct research at two different agency sites, one of which receives the intervention and the other does not. No one was assigned to treatment or comparison groups; those groupings existed prior to the study. While this method is more convenient for real-world research, it is less likely that the groups are comparable than if they had been determined by random assignment. Perhaps the treatment group has a unique characteristic–for example, higher income or different diagnoses–that makes the treatment appear more effective.
Quasi-experiments are particularly useful in social welfare policy research. Social welfare policy researchers often look for what are termed natural experiments, or situations in which comparable groups are created by differences that already occur in the real world. Natural experiments are a feature of the social world that allows researchers to use the logic of experimental design to investigate the connection between variables. For example, Stratmann and Wille (2016) were interested in the effects of a state healthcare policy called Certificate of Need on the quality of hospitals. They clearly could not randomly assign states to adopt one set of policies or another. Instead, the researchers used hospital referral regions, or the areas from which hospitals draw their patients, that spanned across state lines. Because the hospitals were in the same referral region, the researchers could be reasonably sure that patient characteristics were similar. In this way, they could classify patients into experimental and comparison groups without dictating state policy or telling people where to live.
Matching is another approach in quasi-experimental design for assigning people to experimental and comparison groups. It begins with researchers thinking about what variables are important in their study, particularly demographic variables or attributes that might impact their dependent variable. Individual matching involves pairing participants with similar attributes. Then, the matched pair is split—with one participant going to the experimental group and the other to the comparison group. An ex post facto control group, in contrast, is created when a researcher matches individuals after the intervention is administered to some participants. Finally, researchers may engage in aggregate matching, in which the comparison group as a whole is selected to be similar to the experimental group on important variables.
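Individual matching as described above can be sketched as sorting participants on the matching attribute and splitting each adjacent pair at random. This is a simplified illustration with a single matching variable (age); the function name and the data are hypothetical, and real matching typically balances several attributes at once.

```python
import random

def individual_match(participants, key, seed=None):
    """Pair participants with similar attribute values, then split each
    matched pair at random between experimental and comparison groups."""
    rng = random.Random(seed)
    ordered = sorted(participants, key=key)   # neighbours have similar values
    experimental, comparison = [], []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)                     # coin flip within each pair
        experimental.append(pair[0])
        comparison.append(pair[1])
    return experimental, comparison

# Six hypothetical participants matched on age.
people = [{"id": i, "age": a} for i, a in enumerate([23, 67, 25, 64, 41, 39])]
exp_group, comp_group = individual_match(people, key=lambda p: p["age"], seed=1)
```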
Time series design
There are many different quasi-experimental designs in addition to the nonequivalent comparison group design described earlier. Describing all of them is beyond the scope of this textbook, but one more design is worth mentioning. The time series design uses multiple observations before and after an intervention. In some cases, experimental and comparison groups are used. In other cases where that is not feasible, a single experimental group is used. By using multiple observations before and after the intervention, the researcher can better understand the true value of the dependent variable in each participant before the intervention starts. Additionally, multiple observations afterwards allow the researcher to see whether the intervention had lasting effects on participants. Time series designs are similar to single-subjects designs, which we will discuss in Chapter 15.
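A minimal sketch of the time series logic compares the average of several observations before the intervention with the average of several after. The function name and scores are hypothetical, and real interrupted time series analyses also examine trends and serial dependence, not just the change in level.

```python
def level_change(observations, intervention_index):
    """Compare average outcomes across multiple observations taken before
    and after an intervention (a simple interrupted time series summary)."""
    before = observations[:intervention_index]
    after = observations[intervention_index:]
    return sum(after) / len(after) - sum(before) / len(before)

# Hypothetical weekly symptom scores: four observations before treatment
# begins (weeks 0-3) and four after (weeks 4-7).
scores = [10, 11, 9, 10, 6, 5, 6, 5]
```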
Pre-experimental design
When true experiments and quasi-experiments are not possible, researchers may turn to a pre-experimental design (Campbell & Stanley, 1963). Pre-experimental designs are called such because they often happen as a precursor to conducting a true experiment. Researchers want to see if their interventions will have some effect on a small group of people before they seek funding and dedicate time to conduct a true experiment. Pre-experimental designs, thus, are usually conducted as a first step towards establishing the evidence for or against an intervention. However, this type of design comes with some unique disadvantages, which we’ll describe below.
A commonly used type of pre-experiment is the one-group pretest post-test design. In this design, pre- and posttests are both administered, but there is no comparison group to which to compare the experimental group. Researchers may be able to make the claim that participants receiving the treatment experienced a change in the dependent variable, but they cannot claim that the change was the result of the treatment without a comparison group. Imagine if the students in your research class completed a questionnaire about their level of stress at the beginning of the semester. Then your professor taught you mindfulness techniques throughout the semester. At the end of the semester, she administers the stress survey again. What if levels of stress went up? Could she conclude that the mindfulness techniques caused stress? Not without a comparison group! If there were a comparison group, she would be able to recognize that all students experienced higher stress at the end of the semester than at the beginning, not just the students in her research class.
In cases where the administration of a pretest is cost prohibitive or otherwise not possible, a one-shot case study design might be used. In this instance, no pretest is administered, nor is a comparison group present. If we wished to measure the impact of a natural disaster, such as Hurricane Katrina, we might conduct a pre-experiment by identifying a community that was hit by the hurricane and then measuring the levels of stress in the community. Researchers using this design must be extremely cautious about making claims regarding the effect of the treatment or stimulus. They have no idea what the levels of stress in the community were before the hurricane hit, nor can they compare the stress levels to those of a community that was not affected by the hurricane. Nonetheless, this design can be useful for exploratory studies aimed at testing a measure or the feasibility of further study.
In our example of the study of the impact of Hurricane Katrina, a researcher might choose to examine the effects of the hurricane by identifying a group from a community that experienced the hurricane and a comparison group from a similar community that had not been hit by the hurricane. This study design, called a static group comparison, has the advantage of including a comparison group that did not experience the stimulus (in this case, the hurricane). Unfortunately, the design only uses post-tests, so it is not possible to know whether the groups were comparable before the stimulus or intervention. As you might have guessed from our example, static group comparisons are useful in cases where a researcher cannot control or predict whether, when, or how the stimulus is administered, as in the case of natural disasters.
As implied by the preceding examples where we considered studying the impact of Hurricane Katrina, experiments, quasi-experiments, and pre-experiments do not necessarily need to take place in the controlled setting of a lab. In fact, many applied researchers rely on experiments to assess the impact and effectiveness of various programs and policies. You might recall our discussion of arresting perpetrators of domestic violence in Chapter 2, which is an excellent example of an applied experiment. Researchers did not subject participants to conditions in a lab setting; instead, they applied their stimulus (in this case, arrest) to some subjects in the field and they also had a control group in the field that did not receive the stimulus (and therefore were not arrested).
Key Takeaways
- Quasi-experimental designs do not use random assignment.
- Comparison groups are used in quasi-experiments.
- Matching is a way of improving the comparability of experimental and comparison groups.
- Quasi-experimental designs and pre-experimental designs are often used when experimental designs are impractical.
- Quasi-experimental and pre-experimental designs may be easier to carry out, but they lack the rigor of true experiments.
- Aggregate matching – when the comparison group is determined to be similar to the experimental group along important variables
- Comparison group – a group in quasi-experimental design that does not receive the experimental treatment; it is similar to a control group except assignment to the comparison group is not determined by random assignment
- Ex post facto control group – a control group created when a researcher matches individuals after the intervention is administered
- Individual matching – pairing participants with similar attributes for the purpose of assignment to groups
- Natural experiments – situations in which comparable groups are created by differences that already occur in the real world
- Nonequivalent comparison group design – a quasi-experimental design similar to a classic experimental design but without random assignment
- One-group pretest post-test design – a pre-experimental design that applies an intervention to one group but also includes a pretest
- One-shot case study – a pre-experimental design that applies an intervention to only one group without a pretest
- Pre-experimental designs – a variation of experimental design that lacks the rigor of experiments and is often used before a true experiment is conducted
- Quasi-experimental design – a design that lacks random assignment to experimental and control groups
- Static group comparison – uses an experimental group and a comparison group, without random assignment or pretesting
- Time series design – a quasi-experimental design that uses multiple observations before and after an intervention
Image attributions
cat and kitten matching avocado costumes on the couch looking at the camera by Your Best Digs CC-BY-2.0
Pre-Experimental Designs
Pre-experiments are the simplest form of research design. In a pre-experiment, either a single group or multiple groups are observed subsequent to some agent or treatment presumed to cause change.
Types of Pre-Experimental Design
One-Shot Case Study Design

A single group is studied at a single point in time after some treatment that is presumed to have caused change. The carefully studied single instance is compared to general expectations of what the case would have looked like had the treatment not occurred and to other events casually observed. No control or comparison group is employed.

One-Group Pretest-Posttest Design

A single case is observed at two time points, one before the treatment and one after the treatment. Changes in the outcome of interest are presumed to be the result of the intervention or treatment. No control or comparison group is employed.

Static-Group Comparison

A group that has experienced some treatment is compared with one that has not. Observed differences between the two groups are assumed to be a result of the treatment.
Validity of Results
An important drawback of pre-experimental designs is that they are subject to numerous threats to their validity. Consequently, it is often difficult or impossible to dismiss rival hypotheses or explanations. Therefore, researchers must exercise extreme caution in interpreting and generalizing the results from pre-experimental studies.
One reason that it is often difficult to assess the validity of studies that employ a pre-experimental design is that they often do not include any control or comparison group. Without something to compare it to, it is difficult to assess the significance of an observed change in the case. The change could be the result of historical changes unrelated to the treatment, the maturation of the subject, or an artifact of the testing.
Even when pre-experimental designs identify a comparison group, it is still difficult to dismiss rival hypotheses for the observed change. This is because there is no formal way to determine whether the two groups would have been the same if it had not been for the treatment. If the treatment group and the comparison group differ after the treatment, this might be a reflection of differences in the initial recruitment to the groups or differential mortality in the experiment.
Advantages and Disadvantages
As exploratory approaches, pre-experiments can be a cost-effective way to discern whether a potential explanation is worthy of further investigation.
Disadvantages
Pre-experiments offer few advantages since it is often difficult or impossible to rule out alternative explanations. The nearly insurmountable threats to their validity are clearly the most important disadvantage of pre-experimental research designs.
- Study and Communication Skills in Chemistry
- Theoretical Chemistry
- Browse content in Computer Science
- Artificial Intelligence
- Computer Architecture and Logic Design
- Game Studies
- Human-Computer Interaction
- Mathematical Theory of Computation
- Programming Languages
- Software Engineering
- Systems Analysis and Design
- Virtual Reality
- Browse content in Computing
- Business Applications
- Computer Games
- Computer Security
- Computer Networking and Communications
- Digital Lifestyle
- Graphical and Digital Media Applications
- Operating Systems
- Browse content in Earth Sciences and Geography
- Atmospheric Sciences
- Environmental Geography
- Geology and the Lithosphere
- Maps and Map-making
- Meteorology and Climatology
- Oceanography and Hydrology
- Palaeontology
- Physical Geography and Topography
- Regional Geography
- Soil Science
- Urban Geography
- Browse content in Engineering and Technology
- Agriculture and Farming
- Biological Engineering
- Civil Engineering, Surveying, and Building
- Electronics and Communications Engineering
- Energy Technology
- Engineering (General)
- Environmental Science, Engineering, and Technology
- History of Engineering and Technology
- Mechanical Engineering and Materials
- Technology of Industrial Chemistry
- Transport Technology and Trades
- Browse content in Environmental Science
- Applied Ecology (Environmental Science)
- Conservation of the Environment (Environmental Science)
- Environmental Sustainability
- Environmentalist Thought and Ideology (Environmental Science)
- Management of Land and Natural Resources (Environmental Science)
- Natural Disasters (Environmental Science)
- Nuclear Issues (Environmental Science)
- Pollution and Threats to the Environment (Environmental Science)
- Social Impact of Environmental Issues (Environmental Science)
- History of Science and Technology
- Browse content in Materials Science
- Ceramics and Glasses
- Composite Materials
- Metals, Alloying, and Corrosion
- Nanotechnology
- Browse content in Mathematics
- Applied Mathematics
- Biomathematics and Statistics
- History of Mathematics
- Mathematical Education
- Mathematical Finance
- Mathematical Analysis
- Numerical and Computational Mathematics
- Probability and Statistics
- Pure Mathematics
- Browse content in Neuroscience
- Cognition and Behavioural Neuroscience
- Development of the Nervous System
- Disorders of the Nervous System
- History of Neuroscience
- Invertebrate Neurobiology
- Molecular and Cellular Systems
- Neuroendocrinology and Autonomic Nervous System
- Neuroscientific Techniques
- Sensory and Motor Systems
- Browse content in Physics
- Astronomy and Astrophysics
- Atomic, Molecular, and Optical Physics
- Biological and Medical Physics
- Classical Mechanics
- Computational Physics
- Condensed Matter Physics
- Electromagnetism, Optics, and Acoustics
- History of Physics
- Mathematical and Statistical Physics
- Measurement Science
- Nuclear Physics
- Particles and Fields
- Plasma Physics
- Quantum Physics
- Relativity and Gravitation
- Semiconductor and Mesoscopic Physics
- Browse content in Psychology
- Affective Sciences
- Clinical Psychology
- Cognitive Neuroscience
- Cognitive Psychology
- Criminal and Forensic Psychology
- Developmental Psychology
- Educational Psychology
- Evolutionary Psychology
- Health Psychology
- History and Systems in Psychology
- Music Psychology
- Neuropsychology
- Organizational Psychology
- Psychological Assessment and Testing
- Psychology of Human-Technology Interaction
- Psychology Professional Development and Training
- Research Methods in Psychology
- Social Psychology
- Browse content in Social Sciences
- Browse content in Anthropology
- Anthropology of Religion
- Human Evolution
- Medical Anthropology
- Physical Anthropology
- Regional Anthropology
- Social and Cultural Anthropology
- Theory and Practice of Anthropology
- Browse content in Business and Management
- Business History
- Business Ethics
- Business Strategy
- Business and Technology
- Business and Government
- Business and the Environment
- Comparative Management
- Corporate Governance
- Corporate Social Responsibility
- Entrepreneurship
- Health Management
- Human Resource Management
- Industrial and Employment Relations
- Industry Studies
- Information and Communication Technologies
- International Business
- Knowledge Management
- Management and Management Techniques
- Operations Management
- Organizational Theory and Behaviour
- Pensions and Pension Management
- Public and Nonprofit Management
- Social Issues in Business and Management
- Strategic Management
- Supply Chain Management
- Browse content in Criminology and Criminal Justice
- Criminal Justice
- Criminology
- Forms of Crime
- International and Comparative Criminology
- Youth Violence and Juvenile Justice
- Development Studies
- Browse content in Economics
- Agricultural, Environmental, and Natural Resource Economics
- Asian Economics
- Behavioural Finance
- Behavioural Economics and Neuroeconomics
- Econometrics and Mathematical Economics
- Economic Methodology
- Economic History
- Economic Systems
- Economic Development and Growth
- Financial Markets
- Financial Institutions and Services
- General Economics and Teaching
- Health, Education, and Welfare
- History of Economic Thought
- International Economics
- Labour and Demographic Economics
- Law and Economics
- Macroeconomics and Monetary Economics
- Microeconomics
- Public Economics
- Urban, Rural, and Regional Economics
- Welfare Economics
- Browse content in Education
- Adult Education and Continuous Learning
- Care and Counselling of Students
- Early Childhood and Elementary Education
- Educational Equipment and Technology
- Educational Strategies and Policy
- Higher and Further Education
- Organization and Management of Education
- Philosophy and Theory of Education
- Schools Studies
- Secondary Education
- Teaching of a Specific Subject
- Teaching of Specific Groups and Special Educational Needs
- Teaching Skills and Techniques
- Browse content in Environment
- Applied Ecology (Social Science)
- Climate Change
- Conservation of the Environment (Social Science)
- Environmentalist Thought and Ideology (Social Science)
- Management of Land and Natural Resources (Social Science)
- Natural Disasters (Environment)
- Pollution and Threats to the Environment (Social Science)
- Social Impact of Environmental Issues (Social Science)
- Sustainability
- Browse content in Human Geography
- Cultural Geography
- Economic Geography
- Political Geography
- Browse content in Interdisciplinary Studies
- Communication Studies
- Museums, Libraries, and Information Sciences
- Browse content in Politics
- African Politics
- Asian Politics
- Chinese Politics
- Comparative Politics
- Conflict Politics
- Elections and Electoral Studies
- Environmental Politics
- Ethnic Politics
- European Union
- Foreign Policy
- Gender and Politics
- Human Rights and Politics
- Indian Politics
- International Relations
- International Organization (Politics)
- Irish Politics
- Latin American Politics
- Middle Eastern Politics
- Political Theory
- Political Behaviour
- Political Economy
- Political Institutions
- Political Methodology
- Political Communication
- Political Philosophy
- Political Sociology
- Politics and Law
- Politics of Development
- Public Policy
- Public Administration
- Qualitative Political Methodology
- Quantitative Political Methodology
- Regional Political Studies
- Russian Politics
- Security Studies
- State and Local Government
- UK Politics
- US Politics
- Browse content in Regional and Area Studies
- African Studies
- Asian Studies
- East Asian Studies
- Japanese Studies
- Latin American Studies
- Middle Eastern Studies
- Native American Studies
- Scottish Studies
- Browse content in Research and Information
- Research Methods
- Browse content in Social Work
- Addictions and Substance Misuse
- Adoption and Fostering
- Care of the Elderly
- Child and Adolescent Social Work
- Couple and Family Social Work
- Direct Practice and Clinical Social Work
- Emergency Services
- Human Behaviour and the Social Environment
- International and Global Issues in Social Work
- Mental and Behavioural Health
- Social Justice and Human Rights
- Social Policy and Advocacy
- Social Work and Crime and Justice
- Social Work Macro Practice
- Social Work Practice Settings
- Social Work Research and Evidence-based Practice
- Welfare and Benefit Systems
- Browse content in Sociology
- Childhood Studies
- Community Development
- Comparative and Historical Sociology
- Disability Studies
- Economic Sociology
- Gender and Sexuality
- Gerontology and Ageing
- Health, Illness, and Medicine
- Marriage and the Family
- Migration Studies
- Occupations, Professions, and Work
- Organizations
- Population and Demography
- Race and Ethnicity
- Social Theory
- Social Movements and Social Change
- Social Research and Statistics
- Social Stratification, Inequality, and Mobility
- Sociology of Religion
- Sociology of Education
- Sport and Leisure
- Urban and Rural Studies
- Browse content in Warfare and Defence
- Defence Strategy, Planning, and Research
- Land Forces and Warfare
- Military Administration
- Military Life and Institutions
- Naval Forces and Warfare
- Other Warfare and Defence Issues
- Peace Studies and Conflict Resolution
- Weapons and Equipment
- < Previous chapter
- Next chapter >
2 Pre-Experimental Research Designs
- Published: February 2012
The simplest of the group research designs involve assessing the functioning of a single group of persons who receive social work services. These methods are called pre-experimental designs. Tightly controlled studies done in laboratory or special treatment settings are known as efficacy studies and are used to demonstrate whether a given treatment can produce positive results under ideal conditions. Outcome studies done with more clinically representative clients and therapists, in real-world agency settings, are known as effectiveness studies. Ideally the latter are conducted after the former, under conditions of increasing complexity, so as to determine which treatments work well in real-world contexts. Among the pre-experimental designs are the one-group posttreatment-only study and the one-group pretest-posttest design. Various ways in which these designs can be strengthened are presented, along with descriptions of published articles illustrating their use in social work and other human service settings. The limitations of these designs are also discussed, as are the major threats to internal validity that can inhibit causal inferences.
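The one-group pretest-posttest design described in the abstract can be illustrated with a minimal sketch (hypothetical data; the sample size, scale, and effect size are all assumptions, and the paired t-test stands in for the assessment of change):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical one-group pretest-posttest data: 30 clients measured on the
# same outcome scale before and after receiving services.
pretest = rng.normal(loc=50, scale=10, size=30)
posttest = pretest + rng.normal(loc=3, scale=5, size=30)  # simulated change

# Assess change with a paired t-test on the within-person scores.
res = stats.ttest_rel(posttest, pretest)
mean_change = float(np.mean(posttest - pretest))
print(f"mean change = {mean_change:.2f}, t = {res.statistic:.2f}, p = {res.pvalue:.4f}")
```

Note that, as the abstract stresses, a significant pre-post change in a single group cannot by itself be attributed to the services received: threats to internal validity (history, maturation, regression to the mean) remain uncontrolled without a comparison group.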
- Front Psychol
Evaluating Intervention Programs with a Pretest-Posttest Design: A Structural Equation Modeling Approach
Guido Alessandri
1 Department of Psychology, Sapienza University of Rome, Rome, Italy
Antonio Zuffianò
2 Department of Psychology, Liverpool Hope University, Liverpool, UK
Enrico Perinelli
A common situation in the evaluation of intervention programs is that the researcher can rely on only two waves of data (i.e., pretest and posttest), which profoundly constrains the choice of statistical analyses that can be conducted. Indeed, the evaluation of intervention programs based on a pretest-posttest design has usually been carried out with classic statistical tests, such as ANOVA-family analyses, which are limited by exclusively analyzing intervention effects at the group level. In this article, we show how second-order multiple-group latent curve modeling (SO-MG-LCM) can be a useful methodological tool for a more realistic and informative assessment of intervention programs with two waves of data. We offer a practical step-by-step guide to properly implementing this methodology, and we outline the advantages of the LCM approach over classic ANOVA analyses. Furthermore, we provide a real-data example by re-analyzing the implementation of the Young Prosocial Animation, a universal intervention program aimed at promoting prosociality among youth. Although previous studies have pointed to the usefulness of MG-LCM for evaluating intervention programs (Muthén and Curran, 1997; Curran and Muthén, 1999), no previous study has shown that this approach can be used even in pretest-posttest (i.e., two-time-point) designs. Given the advantages of latent variable analyses in examining differences in interindividual and intraindividual changes (McArdle, 2009), the methodological and substantive implications of our proposed approach are discussed.
Introduction
Evaluating intervention programs is at the core of many educational and clinical psychologists' research agendas (Malti et al., 2016; Achenbach, 2017). From a methodological perspective, collecting data at several points in time (usually T ≥ 3) is important to test the long-term strength of intervention effects once the treatment is completed, as in classic designs including pretest, posttest, and follow-up assessments (Roberts and Ilardi, 2003). However, several factors can hinder the researcher's capacity to collect data at follow-up assessments, in particular lack of funds, participants' poor compliance with monitoring, participants' relocation to different areas, etc. Accordingly, the less advantageous pretest-posttest design (i.e., assessment before and after the intervention) remains a widely used methodological choice in the psychological intervention field. Indeed, a literature search of the PsycINFO database using the string "intervention AND pretest AND posttest AND follow-up", limited to the abstract section and to publication dates from January 2006 to December 2016, returned 260 documents. When we replaced "AND follow-up" with "NOT follow-up", the results were 1,544 (see Appendix A to replicate these literature search strategies).
A further matter of concern arises from the statistical approaches commonly used for evaluating intervention programs in pretest-posttest designs, mostly ANOVA-family analyses, which heavily rely on statistical assumptions (e.g., normality, homogeneity of variance, independence of observations, absence of measurement error, and so on) that are rarely met in psychological research (Schmider et al., 2010; Nimon, 2012).
However, all is not lost, and some analytical tools are available to help researchers better assess the efficacy of programs based on a pretest-posttest design (see McArdle, 2009). The goal of this article is to offer a formal presentation of a latent curve model approach (LCM; Muthén and Curran, 1997) to analyzing intervention effects with only two waves of data. After a brief overview of the advantages of the LCM framework over classic ANOVA analyses, a step-by-step application of the LCM to real pretest-posttest intervention data is provided.
Evaluation approaches: observed variables vs. latent variables
Broadly speaking, approaches to intervention evaluation can be distinguished into two categories: (1) approaches using observed variables and (2) approaches using latent variables. The first category includes widely used parametric tests such as Student's t, repeated-measures analysis of variance (RM-ANOVA), analysis of covariance (ANCOVA), and ordinary least-squares regression (see Tabachnick and Fidell, 2013). However, despite their broad use, observed-variable approaches suffer from several limitations, many of them stemming from the strong underlying statistical assumptions that must be satisfied. A first set of assumptions underlying classic parametric tests is that the data being analyzed are normally distributed and have equal population variances (the homogeneity of variance, or homoscedasticity, assumption). The normality assumption is not always met in real data, especially when the variables targeted by the treatment program are infrequent behaviors (e.g., externalizing conducts) or clinical syndromes (Micceri, 1989). Likewise, the homoscedasticity assumption is rarely met in randomized controlled trials, because the experimental variable itself can cause differences in variability between groups (Grissom and Kim, 2012). Violation of the normality and homoscedasticity assumptions can compromise the results of classic parametric tests, in particular the rates of Type-I (Tabachnick and Fidell, 2013) and Type-II errors (Wilcox, 1998). Furthermore, the inability to deal with measurement error can also lower the accuracy of inferences based on regression and ANOVA-family techniques, which assume that variables are measured without error. However, some degree of measurement error is a common situation in psychological research, where the focus is often on constructs that are not directly observable, such as depression, self-esteem, or intelligence.
Finally, observed-variable approaches assume (without testing it) that the measurement structure of the construct under investigation is invariant across groups and/or time (Meredith and Teresi, 2006; Millsap, 2011). Thus, unmet statistical assumptions and/or uncontrolled unreliability can lead to under- or overestimation of the true relations among the constructs analyzed (for a detailed discussion of these issues, see Cole and Preacher, 2014).
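The assumption checks discussed above can be run directly before choosing a test. A minimal sketch with hypothetical, deliberately skewed data (the group sizes and scale parameters are arbitrary assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical skewed outcome (e.g., counts of externalizing conducts)
# in a treatment and a control group.
treatment = rng.exponential(scale=2.0, size=60)
control = rng.exponential(scale=3.0, size=60)

# Shapiro-Wilk tests the normality assumption within each group.
norm_t = stats.shapiro(treatment)
norm_c = stats.shapiro(control)

# Levene's test checks homogeneity of variance across the two groups.
var_test = stats.levene(treatment, control)

print(f"normality p (treatment) = {norm_t.pvalue:.4f}")
print(f"normality p (control)   = {norm_c.pvalue:.4f}")
print(f"equal-variance p        = {var_test.pvalue:.4f}")
```

With strongly skewed data such as these, the normality tests reject, signaling that classic parametric results should be interpreted with caution.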
On the other side, latent variable approaches refer to the class of techniques grouped under the label of structural equation modeling (SEM; Bollen, 1989), such as confirmatory factor analysis (CFA; Brown, 2015) and mean and covariance structures analysis (MACS; Little, 1997). Although a complete overview of the benefits of SEM is beyond the scope of the present work (for a thorough discussion, see Little, 2013; Kline, 2016), it is worth mentioning here those advantages that directly relate to the evaluation of intervention programs. First, SEM can easily accommodate the lack of normality in the data. Indeed, several estimation methods with standard errors robust to non-normal data are available and easy to use in many popular statistical programs (e.g., MLM, MLR, WLSMV, etc. in Mplus; Muthén and Muthén, 1998–2012). Second, SEM explicitly accounts for measurement error by separating the common variance among the indicators of a given construct (i.e., the latent variable) from their residual variances (which include both measurement error and unique sources of variability). Third, if multiple items from a scale are used to assess a construct, SEM allows the researcher to evaluate to what extent the measurement structure (i.e., factor loadings, item intercepts, residual variances, etc.) of that scale is equivalent across groups (e.g., intervention group vs. control group) and/or over time (i.e., pretest and posttest); this issue is known as measurement invariance (MI) and, despite its crucial importance for properly interpreting psychological findings, is rarely tested in psychological research (for an overview, see Millsap, 2011; Brown, 2015). Finally, different competing SEMs can be evaluated and compared according to their goodness of fit (Kline, 2016). Many SEM programs, indeed, report a series of fit indices that help the researcher assess whether the hypothesized model is consistent with the data.
In sum, when multiple indicators of the constructs of interest are available (e.g., multiple items from one scale, different informants, multiple methods, etc.), latent variable approaches offer many advantages and, therefore, should be preferred over manifest-variable approaches (Little et al., 2009). Moreover, when a construct is measured with a single psychometric instrument, there are still ways to incorporate the individuals' scores in the analyses as latent variables, and thus reduce the impact of measurement unreliability (Cole and Preacher, 2014).
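To make the unreliability point concrete, here is a small simulation (not from the article) of the classical attenuation effect: with a true correlation of .50 and reliabilities of .70 for both measures, the expected observed correlation drops to .50 × .70 = .35.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# True scores on two constructs correlated at 0.50.
true_x = rng.normal(size=n)
true_y = 0.5 * true_x + np.sqrt(1 - 0.5**2) * rng.normal(size=n)

# Fallible observed scores with reliability 0.70 each: 70% true-score
# variance, 30% error variance (total variance kept at 1).
rel = 0.70
obs_x = np.sqrt(rel) * true_x + np.sqrt(1 - rel) * rng.normal(size=n)
obs_y = np.sqrt(rel) * true_y + np.sqrt(1 - rel) * rng.normal(size=n)

r_true = np.corrcoef(true_x, true_y)[0, 1]
r_obs = np.corrcoef(obs_x, obs_y)[0, 1]
# Classical attenuation: r_obs ≈ r_true * sqrt(rel_x * rel_y) = 0.50 * 0.70
print(f"true r = {r_true:.3f}, observed r = {r_obs:.3f}")
```

Latent variable models counteract exactly this bias by estimating relations among error-free latent constructs rather than among fallible observed scores.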
Latent curve models
Among latent variable models of change, latent curve models (LCMs; Meredith and Tisak, 1990) represent a useful and versatile tool to model stability and change in the outcomes targeted by an intervention program (Muthén and Curran, 1997; Curran and Muthén, 1999). Specifically, in LCM individual differences in the rate of change can be flexibly modeled through the use of two continuous random latent variables: the intercept (which usually represents the level of the outcome of interest at the pretest) and the slope (i.e., the mean-level change over time from the pretest to the posttest). In detail, both the intercept and the slope have a mean (i.e., the average initial level and the average rate of change, respectively) and a variance (i.e., the amount of inter-individual variability around the average initial level and the average rate of change). Importantly, if both the mean and the variance of the latent slope of the outcome y in the intervention group are statistically significant (whereas they are not significant in the control group), this indicates not only an average effect of the intervention, but also that participants were differentially affected by the program (Muthén and Curran, 1997). Hence, the assumption that participants respond to the treatment in the same way (as in ANOVA-family analyses) can easily be relaxed in LCM. Indeed, although individual differences may also be present in the ANOVA design, change is modeled at the group level and, therefore, everyone is assumed to be impacted in the same fashion after exposure to the treatment condition.
As discussed by Muthén and Curran (1997), the LCM approach is particularly useful for evaluating intervention effects when it is conducted within a multiple-group framework (i.e., MG-LCM), namely when the intercept and the slope of the outcome of interest are simultaneously estimated in the intervention and control groups. Indeed, as illustrated in our example, the MG-LCM allows the researcher to test whether both the mean and the variability of the outcome y at the pretest are similar across intervention and control groups, as well as whether the mean rate of change and its inter-individual variability are similar between the two groups. Therefore, the MG-LCM provides information about the efficacy of an intervention program in terms of both (1) its average (i.e., group-level) effect and (2) participants' sensitivity to respond differently to the treatment condition.
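With only two time points the latent slope reduces to the pre-post change score, so the logic above can be previewed with a simple simulation (hypothetical data; the group means, SDs, and sample sizes are assumptions made for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Control group: stable scores, only small unsystematic change.
pre_c = rng.normal(0.0, 1.0, size=n)
post_c = pre_c + rng.normal(0.0, 0.2, size=n)

# Intervention group: participants improve by 0.5 on average, with
# heterogeneous individual treatment effects (SD = 0.4).
pre_t = rng.normal(0.0, 1.0, size=n)
post_t = pre_t + rng.normal(0.5, 0.4, size=n)

# With two waves, the slope factor corresponds to the change score:
# its mean is the average intervention effect and its variance captures
# individual differences in responsiveness to the program.
for label, pre, post in [("control", pre_c, post_c), ("intervention", pre_t, post_t)]:
    change = post - pre
    print(f"{label:>12}: mean slope = {change.mean():+.2f}, slope SD = {change.std():.2f}")
```

The intervention group shows both a nonzero mean slope (the group-level effect) and a larger slope variance (differential responsiveness), the two quantities the MG-LCM disentangles.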
However, a standard MG-LCM cannot be empirically identified with only two waves of data (Bollen and Curran, 2006). Yet, the use of multiple indicators (at least two) for each construct of interest can overcome this problem by allowing the estimation of the intercept and slope as second-order latent variables (McArdle, 2009; Geiser et al., 2013; Bishop et al., 2015). Interestingly, although second-order LCMs are becoming increasingly common in psychological research due to their higher statistical power to detect changes over time in the variables of interest (Geiser et al., 2013), they are still rarely used in the evaluation of intervention programs. In the next section, we present a formal overview of a second-order MG-LCM approach, describe the possible models of change that can be tested to assess intervention effects in pretest-posttest designs, and show an application of the model to real data.
Identification of a two-time point latent curve model using parallel indicators
When only two points in time are available, it is possible to estimate two LCMs: a No-Change Model (see Figure 1, Panel A) and a Latent Change Model (see Figure 1, Panel B). In the following, we describe in detail the statistical underpinnings of both models.
Figure 1. Second-Order Latent Curve Models with parallel indicators (i.e., residual variances of observed indicators are equal within the same latent variable: ε1 within η1 and ε2 within η2). All the intercepts of the observed indicators (Y) and endogenous latent variables (η) are fixed to 0 (not reported in figure). In Model A, the residual variances of η1 and η2 (ζ1 and ζ2, respectively) are freely estimated, whereas in Model B they are fixed to 0. ξ1, intercept; ξ2, slope; κ1, mean of intercept; κ2, mean of slope; ϕ1, variance of intercept; ϕ2, variance of slope; ϕ12, covariance between intercept and slope; η1, latent variable at T1; η2, latent variable at T2; Y, observed indicator of η; ε, residual variance/covariance of observed indicators.
Latent change model
A two-time-point latent change model implies two latent means (κ1, κ2), two latent factor variances (ϕ1, ϕ2), plus the covariance between the intercept and slope factors (ϕ12). This results in a total of 5 + T model parameters, where the additional T parameters are the error variances of the observed variables when VAR(∈t) is allowed to change over time. With two waves of data (i.e., T = 2), this latent change model therefore has 7 parameters to estimate from a total of (2)(3)/2 + 2 = 5 identified means, variances, and covariances of the observed variables. Hence, two waves of data are insufficient to estimate this model. However, the model can be made just-identified (i.e., zero degrees of freedom [df]) by constraining the residual variances of the observed variables to 0. This constraint should be considered structural and thus included in all two-time-point latent change models. In this case, the variances of the latent variables (i.e., the latent intercept representing the starting level, and the latent change score) are equivalent to those of the observed variables. Thus, when fallible variables are used, true scores cannot be separated from their error/residual terms.
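The counting argument above can be verified mechanically. A sketch: for p observed variables there are p(p+1)/2 variances and covariances plus p means available for estimation.

```python
def observed_moments(p: int) -> int:
    """Identified moments for p observed variables:
    p*(p+1)/2 variances and covariances, plus p means."""
    return p * (p + 1) // 2 + p

# One indicator at each of T = 2 waves: (2)(3)/2 + 2 = 5 moments,
# against 5 + T = 7 free parameters -> under-identified.
print(observed_moments(2))  # 5

# Two parallel indicators at each of 2 waves (4 observed variables):
# (4)(5)/2 + 4 = 14 moments -> the second-order model is over-identified.
print(observed_moments(4))  # 14
```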
A possible way to make this latent change model over-identified (i.e., df ≥ 1) is to have at least two observed indicators of the construct of interest at each time point (i.e., T1 and T2). Possible examples include two informants rating the same behavior (e.g., caregivers and teachers), two scales assessing the same construct, etc. Even when the construct of interest is assessed by a single scale, it should be noted that psychological instruments are often composed of several items. Hence, as noted by Steyer et al. (1997), it is possible to randomly partition the items composing the scale into two (or more) parcels that can be treated as parallel forms. By imposing appropriate constraints on the loadings (i.e., λk = 1), the intercepts (τk = 0), and the within-factor residuals (εk = ε), and by fixing to 0 the residual variances of the first-order latent variables ηk (ζk = 0), the model can be specified as a first-order measurement model plus a second-order latent change model (see Figure 1, Panel B). Given these constraints on loadings, intercepts, and first-order factor residual variances, this model is over-identified because we have (4)(5)/2 + 4 = 14 observed variances, covariances, and means. Of course, when three or more indicators are available, identification ceases to be a problem. In this paper, we restrict our attention to the two-parallel-indicator case to address the most basic situation that a researcher can encounter in the evaluation of a two-time-point intervention. Yet, our procedure can easily be extended to cases in which three or more indicators are available at each time point.
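The identification arithmetic above can be checked with a short sketch (plain Python) that counts the nonredundant means, variances, and covariances available for n observed variables:

```python
def observed_moments(n_vars):
    """Number of identified moments for n observed variables:
    n*(n+1)/2 unique variances/covariances plus n means."""
    return n_vars * (n_vars + 1) // 2 + n_vars

# One indicator per wave (T = 2): (2)(3)/2 + 2 = 5 moments,
# fewer than the 5 + T = 7 parameters of the latent change model.
print(observed_moments(2))  # 5 -> under-identified

# Two parallel indicators per wave: (4)(5)/2 + 4 = 14 moments,
# so with the parallel-forms constraints the model is over-identified.
print(observed_moments(4))  # 14
```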
Specification
More formally, and under the usual assumptions (Meredith and Tisak, 1990), the measurement model for the above two-time-point latent change model in group k becomes:
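In standard matrix notation, this measurement model can be written (our reconstruction, consistent with the terms defined below) as:

```latex
\mathbf{y}_k = \boldsymbol{\tau}_{y_k} + \boldsymbol{\Lambda}_{y_k}\,\boldsymbol{\eta}_k + \boldsymbol{\epsilon}_k \tag{1}
```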
where y k is an mp × 1 random vector containing the observed scores y itk for the ith variable at time t, i ∈ {1, 2, …, p}, and t ∈ {1, 2, …, m}. The intercepts are contained in the mp × 1 vector τ y k, Λ y k is an mp × mq matrix of factor loadings, η k is an mq × 1 vector of factor scores, and ε k is an mp × 1 vector of unobserved error terms. The population mean vector, μ y k, and covariance matrix, Σ y k, or Means and Covariance Structure (MACS), are:
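In the same reconstructed notation, the implied mean and covariance structure is:

```latex
\boldsymbol{\mu}_{y_k} = \boldsymbol{\tau}_{y_k} + \boldsymbol{\Lambda}_{y_k}\,\boldsymbol{\mu}_{\eta_k},
\qquad
\boldsymbol{\Sigma}_{y_k} = \boldsymbol{\Lambda}_{y_k}\,\boldsymbol{\Sigma}_{\eta_k}\,\boldsymbol{\Lambda}_{y_k}' + \boldsymbol{\Theta}_{\epsilon_k} \tag{2}
```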
where μ η k is a vector of latent factor means, Σ η k is the modeled latent covariance matrix, and Θ ε k is an mp × mp matrix of observed-variable residual covariances. For each column, fixing one element of Λ y k to 1 and one element of τ y k to 0 identifies the model. By imposing increasingly restrictive constraints on the elements of Λ y and τ y, the above two-indicator, two-time-point model can be identified.
The general equations for the structural part of a second-order (SO) multiple-group (MG) model are:
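In reconstructed form (with α k denoting the vector of second-order latent means, as used later in the text):

```latex
\boldsymbol{\eta}_k = \boldsymbol{\Gamma}_k\,\boldsymbol{\xi}_k + \boldsymbol{\zeta}_k,
\qquad
E(\boldsymbol{\xi}_k) = \boldsymbol{\alpha}_k \tag{3}
```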
where Γ k is an mq × qr matrix containing second-order factor coefficients, ξ k is a qr × 1 vector of second-order latent variables, and ζ k is an mq × 1 vector containing latent variable disturbance scores. Note that q is the number of latent factors and r is the number of latent curves for each latent factor.
The population mean vector, μ η k, and covariance matrix, Σ η k, based on (3), are:
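Our reconstruction, consistent with (3) and the terms defined below:

```latex
\boldsymbol{\mu}_{\eta_k} = \boldsymbol{\Gamma}_k\,\boldsymbol{\alpha}_k,
\qquad
\boldsymbol{\Sigma}_{\eta_k} = \boldsymbol{\Gamma}_k\,\boldsymbol{\Phi}_k\,\boldsymbol{\Gamma}_k' + \boldsymbol{\Psi}_k \tag{4}
```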
where Φ k is an r × r covariance matrix of the second-order latent variables, and Ψ k is an mq × mq latent variable residual covariance matrix. In the current application, what differentiates the two models is the way in which the matrices Γ k and Φ k are specified.
Application of the SO-MG-LCM to intervention studies using a pretest-posttest design
The application of the above two-time-point LCM to the evaluation of an intervention is straightforward. Usually, in intervention studies, individuals are randomly assigned to two different groups. The first group (G1) is exposed to an intervention that takes place after the initial time point. The second group (G2), also called the control group, does not receive any direct experimental manipulation. In light of the random assignment, G1 and G2 can be viewed as two equivalent groups drawn from the same population, and the effect of the intervention may be ascertained by comparing individuals' changes from T1 to T2 across the two groups.
Following Muthén and Curran (1997), an intercept factor should be modeled in both groups, whereas an additional latent change factor should be added only in the intervention group. This factor is aimed at capturing the degree of change that is specific to the treatment group. Whereas the latent mean of this factor can be interpreted as the change determined by the intervention, a significant variance indicates meaningful heterogeneity in responding to the treatment. In this model, α y k is a vector containing freely estimated mean values for the intercept (i.e., ξ1) and the slope (i.e., ξ2). Γ y k is thus a 2 × 2 matrix containing basis coefficients, fixed to [1 1]′ for the intercept (i.e., ξ1) and [0 1]′ for the slope (i.e., ξ2). Φ k is a 2 × 2 matrix containing the variances and covariance of the two latent factors representing the intercept and the slope.
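In matrix form, the specification described above for the intervention group is (first column of Γ, the intercept basis [1 1]′; second column, the slope basis [0 1]′; ϕ21 = ϕ12):

```latex
\boldsymbol{\Gamma}_k =
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix},
\qquad
\boldsymbol{\Phi}_k =
\begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{12} & \phi_{22} \end{bmatrix},
\qquad
\boldsymbol{\alpha}_k =
\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix}
```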
Given randomization, restricting the parameters of the intercept to be equal across the control and treatment populations is warranted in a randomized intervention study. Yet, baseline differences can arise in field studies where randomization is not possible or where randomization simply failed during the course of the study (Cook and Campbell, 1979). In such cases, the equality constraints on the mean or the variance of the intercept can be relaxed.
The influence of participants' initial status on the effect of the treatment in the intervention group can also be incorporated in the model (Cronbach and Snow, 1977; Muthén and Curran, 1997; Curran and Muthén, 1999) by regressing the latent change factor onto the intercept factor, so that the mean and variance of the latent change factor in the intervention group are expressed as a function of the initial status. Accordingly, this analysis captures the extent to which inter-individual differences in the targeted outcome at baseline can predispose participants to respond differently to the treatment delivered.
Sequence of models
We suggest a four-step approach to intervention evaluation. By comparing the relative fit of each model, researchers gain important information with which to assess the efficacy of their intervention.
Model 1: no-change model
A no-change model is specified for both the intervention group (henceforth G1) and the control group (henceforth G2). As a first step, indeed, a researcher may assume that the intervention has not produced any meaningful effect; therefore, a no-change model (or strict stability model) should be simultaneously estimated in both the intervention and control groups. In its most general version, the no-change model includes only a second-order intercept factor, which represents the participants' initial level. Importantly, both the mean and the variance of the second-order intercept factor are freely estimated across groups (see Figure 1, Panel A). More formally, in this model, Φ k is a qr × qr covariance matrix of the latent variables, and Γ k is an mq × qr matrix containing, for each latent variable, a set of basis coefficients for the latent curves.
Model 2: latent change model in the intervention group
In this model, a slope growth factor is estimated in the intervention group only. As previously detailed, this additional latent factor is aimed at capturing any possible change in the intervention group. According to our premises, this model represents the "target" model, attesting to a significant intervention effect in G1 but not in G2. Model 1 is then compared with Model 2, and changes in fit indices between the two models are used to evaluate the need for this further latent factor (see Section Statistical Analysis).
Model 3: latent change model in both the intervention and control group
In Model 3, a latent change model is estimated simultaneously in both G1 and G2. The fit of Model 2 is compared with the fit of Model 3, and changes in fit indices between the two models are used to evaluate the need for this further latent factor in the control group. From a conceptual point of view, the goal of Model 3 is twofold, because it allows the researcher: (a) to rule out the possibility of "contamination effects" between the intervention and control groups (Cook and Campbell, 1979); and (b) to assess a possible normative mean-level change in the control group (i.e., a change that cannot be attributed to the treatment delivered). In reference to (b), it should be noted that some variables may show a normative developmental increase during the period of the intervention. For instance, a consistent part of the literature has identified an overall increase in empathic capacities during early childhood (for an overview, see Eisenberg et al., 2015). Hence, researchers aiming to increase empathy-related responding in young children may find that both the intervention and control groups actually improved in their empathic response. In this situation, both the mean and the variance of the latent slope should be constrained to equality across groups to mitigate the risk of confounding intervention effects with the normative development of the construct (for an alternative approach when more than two time points are available, see Muthén and Curran, 1997; Curran and Muthén, 1999). Importantly, the tenability of these constraints can easily be tested through a chi-square difference test (Δχ2) between the chi-squares of the constrained vs. unconstrained models. A significant Δχ2 (usually p < 0.05) indicates that the two models are not statistically equivalent, and the unconstrained model should be preferred. On the contrary, a non-significant Δχ2 (usually p > 0.05) indicates that the two models are statistically equivalent, and the constrained model (i.e., the more parsimonious one) should be preferred.
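The Δχ2 decision rule can be illustrated with a minimal sketch (plain Python; for Δdf = 1 the chi-square survival function has a closed form via the complementary error function):

```python
from math import erfc, sqrt

def delta_chi2_p(delta_chi2):
    """p-value of a chi-square difference test with delta-df = 1."""
    return erfc(sqrt(delta_chi2 / 2.0))

# A large difference is significant: prefer the unconstrained model.
print(delta_chi2_p(6.5) < 0.05)    # True

# A small difference is not: keep the more parsimonious constrained model.
print(delta_chi2_p(0.765) < 0.05)  # False (p ~ 0.38)
```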
Model 4: sensitivity model
After having identified the best fitting model, the parameters of the intercept (i.e., its mean and variance) should be constrained to equality across groups. This sensitivity analysis is crucial to ensure that both groups started with an equivalent initial status on the targeted behavior, which is an important assumption in intervention programs. In line with the previous analyses, the plausibility of these initial-status constraints can easily be tested through the Δχ2 test. Indeed, given randomization, it is reasonable to assume that participants in both groups are characterized by similar or identical starting levels, and that the groups have the same variability. These assumptions lead to a constrained no-change, no-group-difference model. This model is the same as the previous one, except that κ k = κ, or in our situation κ 1 = κ 2. Moreover, in our situation, r = 1, q = 1, m = 2, and hence Φ k = Φ is a scalar, Γ k = 1 2, and Ψ k = ΨI 2 for each of the k populations.
In the next section, the above sequence of models is applied to the evaluation of a universal intervention program aimed at improving students' prosociality. We present the results of every step implied by the above methodology, and we offer a set of Mplus syntaxes to allow researchers to estimate the above models on their own datasets.
The young prosocial animation program
The Young Prosocial Animation (YPA; Zuffianò et al., 2012) is a universal intervention program (Greenberg et al., 2001) designed to sensitize adolescents to prosocial and empathic values.
In detail, the YPA tries to valorize: (a) the status of people who behave prosocially, (b) the similarity between the "model" and the participants, and (c) the outcomes related to prosocial actions. Following Bandura's (1977) concept of modeling, people are in fact more likely to engage in behaviors they value, and when the model is perceived as similar and as having an admired status. The main idea is that valuing these three aspects could foster a prosocial sensitization among the participants (Zuffianò et al., 2012). In other terms, the goal is to promote the cognitive and emotional aspects of prosociality in order to strengthen attitudes to act and think in a "prosocial way." The expected change, therefore, is at the level of personal dispositions, in terms of an increased receptiveness and propensity for prosocial thinking (i.e., the ability to take others' point of view and to be empathic, as well as the ability to produce ideas and solutions that can help other people, rather than a direct effect on the behaviors enacted by individuals; Zuffianò et al., 2012). Given its characteristics, the YPA can be conceived as a first phase of prosocial sensitization on which to build programs more directly aimed at increasing prosocial behavior (e.g., the CEPIDEA program; Caprara et al., 2014). The YPA pursues this goal through a guided discussion following the viewing of prosocial scenes selected from the film "Pay It Forward"1.
After viewing each scene, a trained researcher, using a standard protocol, guides a discussion among the participants highlighting: (i) the type of prosocial action (e.g., consoling, helping, etc.); (ii) the benefits for the actor and the target of the prosocial action; (iii) the possible benefits of the prosocial action extended to the context (e.g., other persons, the broader community, etc.); (iv) the qualities the actor needs in order to behave prosocially (e.g., empathy, bravery, etc.); (v) the similarity between the participant and the actor of the prosocial behavior; and (vi) the thoughts and feelings experienced while viewing the scene. The researcher has to complete the intervention within 12 sessions (1 h per session, once a week).
For didactic purposes, in the present study we re-analyzed data from an implementation of the YPA in three schools located in a small city in the South of Italy (see Zuffianò et al., 2012 for details).
We expected Model 2 (a latent change model in the intervention group and a no-change model in the control group) to be the best fitting model. Indeed, from a developmental point of view, we had no reason to expect adolescents to show a normative change in prosociality after such a short period of time (Eisenberg et al., 2015). In line with the goal of the YPA, we hypothesized a small-to-medium increase in prosociality in the intervention group. We also expected the two groups not to differ at T1 in their absolute level of prosocial behavior, ensuring that the intervention and control groups were equivalent. Finally, we explored the influence of participants' initial status on the treatment effect, that is, the scenario in which participants with lower initial levels of prosociality benefitted more from attending the YPA sessions.
The study followed a quasi-experimental design, with both the intervention and control groups assessed at two time points: before the YPA intervention (Time 1) and 6 months after (Time 2). Twelve classrooms from three schools (one middle school and two high schools) participated in the study during the 2008–2009 school year. Each school contributed four classes, which were randomly assigned to the intervention and control groups (two classes each).2 In total, six classes were part of the intervention group and six of the control group. The students from the middle school were in the eighth grade (third year of secondary school in Italy), whereas the students from the two high schools were in the ninth (first year of high school in Italy) and tenth grades (second year of high school in Italy).
Participants
The YPA program was implemented in a city in the South of Italy. A total of 250 students participated in the study: 137 students (51.8% males) were assigned to the intervention group and 113 (54% males) to the control group. At T2 there were 113 students in the intervention group (retention rate = 82.5%) and 91 in the control group (retention rate = 80.5%). Little's test of missing completely at random showed a non-significant chi-square value [χ2(2) = 4.698, p = 0.10], indicating that missingness at posttest was not affected by levels of prosociality at pretest. The mean age was 14.2 (SD = 1.09) in the intervention group and 15.2 (SD = 1.76) in the control group. Considering socioeconomic status, 56.8% of families in the intervention group and 60.0% in the control group were one-income families. The professions most represented in the two groups were "worker" among the fathers (36.4% in the intervention group and 27.9% in the control group) and "housewife" among the mothers (56.0% in the intervention group and 55.2% in the control group). Parents' educational level was approximately the same in the two groups: most parents in the intervention group (43.5%) and in the control group (44.7%) had a middle school degree.
Prosociality
Participants rated their prosociality on a 16-item scale (5-point Likert scale: 1 = never/almost never true ; 5 = almost always/always true ) that assesses the degree of engagement in actions aimed at sharing, helping, taking care of others' needs, and empathizing with their feelings (e.g., “ I try to help others ” and “ I try to console people who are sad ”). The alpha reliability coefficient was 0.88 at T1 and 0.87 at T2. The scale has been validated on a large sample of respondents (Caprara et al., 2005 ) and has been found to moderately correlate ( r > 0.50) with other-ratings of prosociality (Caprara et al., 2012 ).
Statistical analysis
All the preceding models were estimated with maximum likelihood (ML) using the Mplus 7 program (Muthén and Muthén, 1998–2012). Missing data were handled using full information maximum likelihood (FIML) estimation, which draws on all available data to estimate model parameters without imputing missing values (Enders, 2010). To evaluate goodness of fit, we relied on several criteria. First, we evaluated the χ2 likelihood ratio statistic for the overall model. Given our interest in the relative fit of the different models of change within G1 and G2, we also investigated the contribution of each group to the overall χ2 value, in order to obtain a more precise indication of the impact of including the latent change factor in a specific group. We also examined the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), the Root Mean Square Error of Approximation (RMSEA) with its associated 90% confidence interval, and the Standardized Root Mean Square Residual (SRMR). We accepted CFI and TLI values >0.90, RMSEA values <0.08, and SRMR values <0.08 (see Kline, 2016). Last, we used the Akaike Information Criterion (AIC; Burnham and Anderson, 2004). The AIC rewards goodness of fit and includes a penalty that is an increasing function of the number of estimated parameters. Burnham and Anderson (2004) recommend rescaling all observed AIC values before selecting the best fitting model according to the formula Δi = AICi − AICmin, where AICmin is the minimum of the observed AIC values among the competing models. Practical guidelines suggest that a model that differs by less than Δi = 2 from the best fitting model (which has Δi = 0) in a specific dataset is "strongly supported by evidence"; if the difference lies between 4 and 7 there is considerably less support, whereas models with Δi > 10 have essentially no support.
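The AIC rescaling rule can be sketched as follows (plain Python; the model names and AIC values are purely illustrative):

```python
def aic_deltas(aic_by_model):
    """Rescale AICs: delta_i = AIC_i - AIC_min (Burnham and Anderson, 2004)."""
    aic_min = min(aic_by_model.values())
    return {name: round(aic - aic_min, 2) for name, aic in aic_by_model.items()}

# Hypothetical AIC values for three competing models.
deltas = aic_deltas({"M1": 1318.7, "M2": 1309.0, "M3": 1310.2})
# M2 has delta = 0 (best fitting); M3 (delta < 2) is still strongly
# supported by evidence; M1 (delta ~ 9.7) has very little support.
print(deltas)
```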
We created two parallel forms of the prosociality scale by following the procedure described in Little et al. (2002, p. 166). In Table 1 we report zero-order correlations, means, standard deviations, reliabilities, skewness, and kurtosis for each parallel form. Cronbach's alphas were good (≥0.74), and all correlations were significant at p < 0.001. Indices of skewness and kurtosis for each parallel form in both groups did not exceed |0.61|; therefore, the univariate distributions of the eight variables (4 variables for 2 groups) did not show substantial deviations from normality (Curran et al., 1996). To check the multivariate normality assumption, we computed Mardia's two-sided multivariate test of fit for skewness and kurtosis. Given the well-known tendency of this coefficient to easily reject H0, we set the alpha level at 0.001 (in this regard, see Mecklin and Mundfrom, 2005; Villasenor Alva and Estrada, 2009). Mardia's test showed p-values of 0.010 and 0.030 for skewness and kurtosis, respectively. Therefore, the study variables showed acceptable, even if not perfect, multivariate normality. Given this modest deviation from the normality assumption, we decided to use maximum likelihood as the estimation method.
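The random item-to-parcel assignment in the spirit of Little et al. (2002) can be sketched as follows (plain Python; the item scores are hypothetical, and the same random split must be applied to every respondent):

```python
import random

def split_items(n_items, seed=2002):
    """Randomly partition item indices into two parcels (parallel forms);
    the split is fixed by the seed so it is identical across respondents."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    half = n_items // 2
    return idx[:half], idx[half:]

def parcel_scores(item_scores, parcel1, parcel2):
    """Mean score of each parcel for one respondent."""
    mean = lambda ids: sum(item_scores[i] for i in ids) / len(ids)
    return mean(parcel1), mean(parcel2)

# Hypothetical answers of one respondent to the 16 prosociality items (1-5).
answers = [4, 3, 5, 4, 4, 2, 5, 3, 4, 4, 3, 5, 4, 4, 3, 4]
p1, p2 = split_items(len(answers))
pr1_t1, pr2_t1 = parcel_scores(answers, p1, p2)
```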
Table 1. Descriptive statistics and zero-order correlations for each group separately (N = 250).
Intervention group (G1)

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| (1) Pr1_T1 | 137 | | | |
| (2) Pr2_T1 | 0.81 | 137 | | |
| (3) Pr1_T2 | 0.51 | 0.52 | 113 | |
| (4) Pr2_T2 | 0.48 | 0.59 | 0.78 | 113 |
| M | 3.44 | 3.49 | 3.62 | 3.71 |
| SD | 0.75 | 0.72 | 0.60 | 0.62 |
| Sk | −0.51 | −0.60 | −0.34 | −0.61 |
| Ku | −0.06 | 0.43 | −0.13 | 0.02 |

Control group (G2)

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| (1) Pr1_T1 | 113 | | | |
| (2) Pr2_T1 | 0.76 | 113 | | |
| (3) Pr1_T2 | 0.74 | 0.67 | 91 | |
| (4) Pr2_T2 | 0.65 | 0.73 | 0.78 | 91 |
| M | 3.42 | 3.49 | 3.49 | 3.55 |
| SD | 0.70 | 0.71 | 0.65 | 0.64 |
| Sk | −0.39 | −0.55 | −0.27 | −0.41 |
| Ku | −0.12 | −0.01 | −0.44 | −0.49 |
Pr1_T1, Parallel form 1 of the Prosociality scale at Time 1; Pr2_T1, Parallel form 2 of the Prosociality scale at Time 1; Pr1_T2, Parallel form 1 of the Prosociality scale at Time 2; Pr2_T2, Parallel form 2 of the Prosociality scale at Time 2; M, mean; SD, standard deviation; Sk, skewness; Ku, kurtosis; n, number of subjects for each parallel form in each group .
Italicized numbers in diagonal are reliability coefficients (Cronbach's α) .
All correlations were significant at p ≤ 0.001 .
Evaluating the impact of the intervention
In Table 2 we report the fit indices for the alternative models (see Appendices B1–B4 for annotated Mplus syntaxes for each of them). As hypothesized, Model 2 (see also Figure 2) was the best fitting model. Trajectories of prosociality for the intervention and control groups are plotted separately in Figure 3. The contribution of each group to the overall chi-square values highlighted how the lack of a slope factor in the intervention group resulted in substantial misfit. On the contrary, adding a slope factor to the control group did not significantly change the overall fit of the model [Δχ2(1) = 0.765, p = 0.381]. Of interest, the intercept mean and variance were equal across groups (see Table 2, Model 4), suggesting the equivalence of G1 and G2 at T1.
Table 2. Goodness-of-fit indices for the tested models.
| Model | NFP | χ2(df) | χ2 G1(df) | χ2 G2(df) | CFI | TLI | RMSEA [90% CI] | SRMR | AIC(ΔAIC) |
|---|---|---|---|---|---|---|---|---|---|
| Model 1 (G1 = A; G2 = A) | 16 | 22.826(12) | 18.779(6) | 4.047(6) | 0.981 | 0.981 | 0.085 [0.026, 0.138] | 0.081 | 1318.690(9.68) |
| Model 3 (G1 = B; G2 = B) | 18 | 10.378(10) | 7.096(5) | 3.282(5) | 0.999 | 0.999 | 0.017 [0.000, 0.099] | 0.045 | 1310.242(1.24) |

| Model | NFP | χ2(df) | χ2 G1(df) | χ2 G2(df) | CFI | TLI | RMSEA [90% CI] | SRMR | Δχ2(Δdf) of M4 vs. M2 |
|---|---|---|---|---|---|---|---|---|---|
| Model 4 | 15 | 13.279(13) | 7.920(6) | 5.359(7) | 1.00 | 1.00 | 0.013 [0.000, 0.090] | 0.160 | 2.136(2) |
G1, intervention group; G2, control group; A, no-change model; B, latent change model; NFP, Number of Free Parameters; df, degrees of freedom; χ 2 G1, contribution of G1 to the overall chi-square value; χ 2 G2, contribution of G2 to the overall chi-square value; CFI, Comparative Fit Index; TLI, Tucker-Lewis Index; RMSEA, Root Mean Square Error of Approximation; CI, confidence intervals; SRMR, Standardized Root Mean Square Residual; AIC, Akaike's Information Criterion .
ΔAIC = Difference in AIC between the best fitting model (i.e., Model 2; highlighted in bold) and each model .
Model 4 = Model 2 with mean and variance of intercepts constrained to be equal across groups .
The full Mplus syntaxes for these models are reported in the Appendices.
Figure 2. Best fitting second-order multiple-group latent curve model with parameter estimates for both groups. Parameters in bold were fixed. This model has parallel indicators (i.e., residual variances of observed indicators are equal within the same latent variable, in each group). All intercepts of the observed indicators (Y) and endogenous latent variables (η) are fixed to 0 (not reported in the figure). G1, intervention group; G2, control group; ξ1, intercept of prosociality; ξ2, slope of prosociality; η1, prosociality at T1; η2, prosociality at T2; Y, observed indicator of prosociality; ε, residual variance of observed indicator. n.s. p > 0.05; *p < 0.05; **p < 0.01; ***p < 0.001.
Figure 3. Trajectories of prosocial behavior for the intervention group (G1) and control group (G2) in the best fitting model (Model 2 in Table 2).
In Figure 2 we report all the parameters of the best fitting model for both groups. The slope factor in the intervention group had a significant variance (ϕ2 = 0.28, p < 0.001) and a positive, significant mean (κ2 = 0.19, p < 0.01). Accordingly, we investigated the influence of initial status on the treatment effect by regressing the slope onto the intercept in the intervention group. Note that this latter model has the same fit as Model 2; however, by estimating a regression path instead of a covariance, it allows controlling for the effect of individuals' initial status on their subsequent change. The significant effect of the intercept on the slope (β = −0.62, p < 0.001; R2 = 0.38) indicated that participants who were less prosocial at the beginning showed a steeper increase in prosociality after the intervention.
Data collected in intervention programs are often limited to two points in time, namely before and after the delivery of the treatment (i.e., pretest and posttest). When analyzing intervention programs with two waves of data, researchers have so far mostly relied on ANOVA-family techniques, which require strong statistical assumptions and assume that all participants are affected in the same fashion by the intervention. Although a general, average effect of the program is often plausible and theoretically sound, neglecting individual variability in responding to the treatment can lead to partial or incorrect conclusions. In this article, we illustrated how latent variable models can help overcome these issues and provide the researcher with a clear model-building strategy for evaluating intervention programs based on a pretest-posttest design. To this aim, we outlined a sequence of four steps, each corresponding to a substantive research question (e.g., efficacy of the intervention, normative development, etc.). In particular, Model 1, Model 2, and Model 3 included different combinations of no-change and latent change models in the intervention and control groups (see Table 2). These first three models are crucial to identify the best fitting trajectory of the targeted behavior across the two groups. Next, Model 4 was aimed at ascertaining whether the intervention and control groups were equivalent in their initial status (both in terms of average starting level and of inter-individual differences) or whether, vice versa, this similarity assumption should be relaxed.
Importantly, even if the intervention and control groups differ in their initial level, this should not prevent the researcher from investigating moderation effects, such as a treatment-by-initial-status interaction, if this is in line with the researcher's hypotheses. Indeed, one of the major advantages of the proposed approach is the possibility of modeling the intervention effect as a random latent variable (i.e., the second-order latent slope) characterized by both a mean (i.e., the average change) and a variance (i.e., the degree of variability around the average effect). As already emphasized by Muthén and Curran (1997), a statistically significant variance indicates systematic individual differences in responding to the intervention program. Accordingly, the latent slope identified in the intervention group can be regressed onto the latent intercept to examine whether participants with different initial values on the targeted behavior were differently affected by the program. Importantly, the analysis of interaction effects need not be limited to the treatment-by-initial-status interaction but can also include other external variables as moderators (e.g., sex, SES, IQ, behavioral problems, etc.; see Caprara et al., 2014).
To complement our formal presentation of the LCM procedure, we provided a real data example by re-analyzing the efficacy of the YPA, a universal intervention program aimed at promoting prosociality in youths (Zuffianò et al., 2012). Our four-step analysis indicated that participants in the intervention group showed a small yet significant increase in prosociality after 6 months, whereas students in the control group did not show any significant change (see Model 1, Model 2, and Model 3 in Table 2). Furthermore, participants in the intervention and control groups did not differ in their initial levels of prosociality (Model 4), thereby ensuring the comparability of the two groups. These results replicated those reported by Zuffianò et al. (2012) and further attested to the effectiveness of the YPA in promoting prosociality among adolescents. Importantly, our results also indicated significant variability among participants in responding to the YPA program, as shown by the significant variance of the latent slope. Accordingly, we explored the possibility of a treatment-by-initial-status interaction. The significant prediction of the slope by the intercept indicated that, after 6 months, participants with lower initial levels of prosociality were more responsive to the intervention. On the contrary, participants who were already prosocial at pretest remained stable overall in their high level of prosociality. Although this effect was not hypothesized a priori, we can speculate that less prosocial participants were more receptive to the content of the program because topics such as the importance and benefits of prosociality were, very likely, relatively new to them, and they appreciated discussing them more than their (already prosocial) counterparts did. However, it is important to note that the goal of the YPA was merely to sensitize youth to prosocial and empathic values, not to change their actual behaviors.
Accordingly, our findings cannot be interpreted as an increase in prosocial conduct among less prosocial participants. Future studies are needed to examine to what extent embedding the YPA in more intensive school-based intervention programs (see Caprara et al., 2014) could further help promote concrete prosocial behaviors.
Limitations and conclusions
Albeit the advantages of the proposed LCM approach, several limitations should be acknowledged. First of all, the use of a second order LCM with two available time points requires that the construct is measured by more than one observed indicators. As such, this technique cannot be used for single-item measures (e.g., Lucas and Donnellan, 2012 ). Second, as any structural equation model, our SO-MG-LCM makes the strong assumption that the specified model should be true in the population. An assumption that is likely to be violated in empirical studies. Moreover, it requires to be empirically identified, and thus an entire set of constraints that leave aside substantive considerations. Third, in this paper, we restricted our attention to the two parallel indicators case to address the more basic situation that a researcher can encounter in the evaluation of a two time-point intervention. Our aim was indeed to confront researchers with the more restrictive case, in terms of model identification. The case in which only two observed indicators are available is indeed, in our opinion, one of the more intimidating for researchers. Moreover, when a scale is composed of a long set of items or the target construct is a second order-construct loaded by two indicators (e.g., as in the case of psychological resilience; see Alessandri et al., 2012 ), and the sample size is not optimal (in terms of the ratio estimated parameters/available subjects) it makes sense to conduct measurement invariance test as a preliminary step, “before” testing the intervention effect, and then use the approach described above to be parsimonious and maximize statistical power. In these circumstances, the interest is indeed on estimating the LCM, and the invariance of indicators likely represent a prerequisite. Measurement invariance issues should never be undervalued by researchers. 
Instead, they should be routinely evaluated in preliminary research phases and, when possible, incorporated in the measurement model specification phase. Finally, although intervention programs with two time points can still offer useful indications, the use of three (and possibly more) points in time provides the researcher with stronger evidence for assessing the actual efficacy of the program at different follow-ups. Hence, the methodology described in this paper should be conceived as a support for making the best of pretest-posttest studies, not as an encouragement to collect only two-wave data. Fourth, SEM techniques usually require relatively larger samples than classic ANOVA analyses. Therefore, our procedure may not be suited for the evaluation of intervention programs based on small samples. Although several rules of thumb have been proposed in the past for conducting SEM (e.g., N > 100), we encourage the use of Monte Carlo simulation studies to accurately plan the minimum sample size before starting the data collection (Bandalos and Leite, 2013; Wolf et al., 2013).
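The Monte Carlo planning step recommended above can be illustrated with a deliberately simplified sketch. It is not the SEM-based simulation described in the cited works; as an assumption for illustration, the latent curve model is replaced by a plain two-sample comparison, and the effect size, number of replications, and seed are arbitrary:

```python
import numpy as np

def simulate_power(n, effect=0.3, n_reps=2000, crit=1.96, seed=0):
    """Estimate the power to detect a standardized pre-post effect of size
    `effect` with `n` participants per wave, via repeated simulated studies."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_reps):
        pre = rng.normal(0.0, 1.0, n)       # simulated pretest scores
        post = rng.normal(effect, 1.0, n)   # posttest scores shifted by the true effect
        # Welch-type t statistic, tested against a normal critical value
        se = np.sqrt(pre.var(ddof=1) / n + post.var(ddof=1) / n)
        t = (post.mean() - pre.mean()) / se
        hits += abs(t) > crit
    return hits / n_reps

# Increase n until the estimated power reaches the conventional 0.80 target
for n in (50, 100, 200):
    print(n, simulate_power(n))
```

The same logic applies to the full SO-MG-LCM: generate data from the hypothesized model, fit it repeatedly, and record how often the intervention effect is detected at each candidate sample size.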
Despite these limitations, we believe that our LCM approach represents a useful and easy-to-use methodology that should be in the toolbox of psychologists and prevention scientists. Several factors, often uncontrollable, can oblige the researcher to collect data at only two points in time. Faced with this less than optimal scenario, all is not lost: researchers should be aware that analytical techniques more accurate and informative than ANOVA are available to assess intervention programs based on a pretest-posttest design.
Author contributions
GA proposed the research question for the study and the methodological approach, and the focus and style of the manuscript; he contributed substantially to the conception and revision of the manuscript, and wrote the first drafts of all manuscript sections and incorporated revisions based on the suggestions and feedback from AZ and EP. AZ contributed the empirical data set, described the intervention and part of the discussion section, and critically revised the content of the study. EP conducted analyses and revised the style and structure of the manuscript.
The authors thank the students who participated in this study. This research was supported in part by a Research Grant (named: “Progetto di Ateneo”, No. 1081/2016) awarded by Sapienza University of Rome to GA, and by a Mobility Research Grant (No. 4389/2016) awarded by Sapienza University of Rome to EP.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
1 Directed by Leder ( 2000 ).
2 Importantly, although classrooms were randomized across the two conditions (i.e., intervention group and control group), the selection of the four classrooms in each school was not random (i.e., each classroom in school X did not have the same probability of participating in the YPA). Specifically, participating classrooms were chosen according to the interest in the project shown by the head teachers.
Supplementary material
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00223/full#supplementary-material
- Achenbach T. M. (2017). Future directions for clinical research, services, and training: evidence-based assessment across informants, cultures, and dimensional hierarchies. J. Clin. Child Adolesc. Psychol. 46, 159–169. doi: 10.1080/15374416.2016.1220315
- Alessandri G., Vecchione M., Caprara G. V., Letzring T. D. (2012). The ego resiliency scale revised: a crosscultural study in Italy, Spain, and the United States. Eur. J. Psychol. Assess. 28, 139–146. doi: 10.1027/1015-5759/a000102
- Bandalos D. L., Leite W. (2013). Use of Monte Carlo studies in structural equation modeling research, in Structural Equation Modeling: A Second Course, 2nd Edn., eds Hancock G. R., Mueller R. O. (Charlotte, NC: Information Age Publishing), 625–666.
- Bandura A. (1977). Self-efficacy: toward a unifying theory of behavioral change. Psychol. Rev. 84, 191–215. doi: 10.1037/0033-295X.84.2.191
- Bishop J., Geiser C., Cole D. A. (2015). Modeling latent growth with multiple indicators: a comparison of three approaches. Psychol. Methods 20, 43–62. doi: 10.1037/met0000018
- Bollen K. A. (1989). Structural Equations with Latent Variables. New York, NY: Wiley.
- Bollen K. A., Curran P. J. (2006). Latent Curve Models: A Structural Equation Perspective. Hoboken, NJ: Wiley.
- Brown T. A. (2015). Confirmatory Factor Analysis for Applied Research. New York, NY: The Guilford Press.
- Burnham K. P., Anderson D. R. (2004). Multimodel inference: understanding AIC and BIC in model selection. Sociol. Methods Res. 33, 261–304. doi: 10.1177/0049124104268644
- Caprara G. V., Alessandri G., Eisenberg N. (2012). Prosociality: the contribution of traits, values, and self-efficacy beliefs. J. Pers. Soc. Psychol. 102, 1289–1303. doi: 10.1037/a0025626
- Caprara G. V., Luengo Kanacri B. P., Gerbino M., Zuffianò A., Alessandri G., Vecchio G., et al. (2014). Positive effects of promoting prosocial behavior in early adolescents: evidence from a school-based intervention. Int. J. Behav. Dev. 4, 386–396. doi: 10.1177/0165025414531464
- Caprara G. V., Steca P., Zelli A., Capanna C. (2005). A new scale for measuring adults' prosocialness. Eur. J. Psychol. Assess. 21, 77–89. doi: 10.1027/1015-5759.21.2.77
- Cole D. A., Preacher K. J. (2014). Manifest variable path analysis: potentially serious and misleading consequences due to uncorrected measurement error. Psychol. Methods 19, 300–315. doi: 10.1037/a0033805
- Cook T. D., Campbell D. T. (1979). Quasi-Experimentation: Design & Analysis Issues for Field Settings. Boston, MA: Houghton Mifflin.
- Cronbach L. J., Snow R. E. (1977). Aptitudes and Instructional Methods: A Handbook for Research on Interactions. New York, NY: Irvington.
- Curran P. J., Muthén B. O. (1999). The application of latent curve analysis to testing developmental theories in intervention research. Am. J. Commun. Psychol. 27, 567–595. doi: 10.1023/A:1022137429115
- Curran P. J., West S. G., Finch J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychol. Methods 1, 16–29. doi: 10.1037/1082-989X.1.1.16
- Eisenberg N., Spinrad T. L., Knafo-Noam A. (2015). Prosocial development, in Handbook of Child Psychology and Developmental Science, Vol. 3, 7th Edn., eds Lamb M. E., Lerner R. M. (Hoboken, NJ: Wiley), 610–656.
- Enders C. K. (2010). Applied Missing Data Analysis. New York, NY: Guilford Press.
- Geiser C., Keller B. T., Lockhart G. (2013). First- versus second-order latent growth curve models: some insights from latent state-trait theory. Struct. Equ. Modeling 20, 479–503. doi: 10.1080/10705511.2013.797832
- Greenberg M. T., Domitrovich C., Bumbarger B. (2001). The prevention of mental disorders in school-aged children: current state of the field. Prevent. Treat. 4:1a. doi: 10.1037/1522-3736.4.1.41a
- Grissom R. J., Kim J. J. (2012). Effect Sizes for Research: Univariate and Multivariate Applications, 2nd Edn. New York, NY: Routledge.
- Kline R. B. (2016). Principles and Practice of Structural Equation Modeling, 4th Edn. New York, NY: The Guilford Press.
- Leder M. (Director). (2000). Pay it Forward [Motion Picture]. Burbank, CA: Warner Bros.
- Little T. D. (1997). Mean and covariance structures (MACS) analyses of cross-cultural data: practical and theoretical issues. Multivariate Behav. Res. 32, 53–76. doi: 10.1207/s15327906mbr3201_3
- Little T. D. (2013). Longitudinal Structural Equation Modeling. New York, NY: The Guilford Press.
- Little T. D., Card N. A., Preacher K. J., McConnell E. (2009). Modeling longitudinal data from research on adolescence, in Handbook of Adolescent Psychology, Vol. 2, 3rd Edn., eds Lerner R. M., Steinberg L. (Hoboken, NJ: Wiley), 15–54.
- Little T. D., Cunningham W. A., Shahar G., Widaman K. F. (2002). To parcel or not to parcel: exploring the question, weighing the merits. Struct. Equ. Modeling 9, 151–173. doi: 10.1207/S15328007SEM0902_1
- Lucas R. E., Donnellan M. B. (2012). Estimating the reliability of single-item life satisfaction measures: results from four national panel studies. Soc. Indic. Res. 105, 323–331. doi: 10.1007/s11205-011-9783-z
- Malti T., Noam G. G., Beelmann A., Sommer S. (2016). Good enough? Interventions for child mental health: from adoption to adaptation—from programs to systems. J. Clin. Child Adolesc. Psychol. 45, 707–709. doi: 10.1080/15374416.2016.1157759
- McArdle J. J. (2009). Latent variable modeling of differences and changes with longitudinal data. Annu. Rev. Psychol. 60, 577–605. doi: 10.1146/annurev.psych.60.110707.163612
- Mecklin C. J., Mundfrom D. J. (2005). A Monte Carlo comparison of the Type I and Type II error rates of tests of multivariate normality. J. Stat. Comput. Simul. 75, 93–107. doi: 10.1080/0094965042000193233
- Meredith W., Teresi J. A. (2006). An essay on measurement and factorial invariance. Med. Care 44, S69–S77. doi: 10.1097/01.mlr.0000245438.73837.89
- Meredith W., Tisak J. (1990). Latent curve analysis. Psychometrika 55, 107–122. doi: 10.1007/BF02294746
- Micceri T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychol. Bull. 105, 156–166. doi: 10.1037/0033-2909.105.1.156
- Millsap R. E. (2011). Statistical Approaches to Measurement Invariance. New York, NY: Routledge.
- Muthén B. O., Curran P. J. (1997). General longitudinal modeling of individual differences in experimental designs: a latent variable framework for analysis and power estimation. Psychol. Methods 2, 371–402. doi: 10.1037/1082-989X.2.4.371
- Muthén L. K., Muthén B. O. (1998–2012). Mplus User's Guide, 7th Edn. Los Angeles, CA: Muthén & Muthén.
- Nimon K. F. (2012). Statistical assumptions of substantive analyses across the general linear model: a mini-review. Front. Psychol. 3:322. doi: 10.3389/fpsyg.2012.00322
- Roberts M. C., Ilardi S. S. (2003). Handbook of Research Methods in Clinical Psychology. Oxford: Blackwell Publishing.
- Schmider E., Ziegler M., Danay E., Beyer L., Bühner M. (2010). Is it really robust? Reinvestigating the robustness of ANOVA against violations of the normal distribution assumption. Methodology 6, 147–151. doi: 10.1027/1614-2241/a000016
- Steyer R., Eid M., Schwenkmezger P. (1997). Modeling true intraindividual change: true change as a latent variable. Methods Psychol. Res. Online 2, 21–33.
- Tabachnick B. G., Fidell L. S. (2013). Using Multivariate Statistics, 6th Edn. New Jersey: Pearson.
- Villasenor Alva J. A., Estrada E. G. (2009). A generalization of Shapiro–Wilk's test for multivariate normality. Commun. Stat. Theor. Methods 38, 1870–1883. doi: 10.1080/03610920802474465
- Wilcox R. R. (1998). The goals and strategies of robust methods. Br. J. Math. Stat. Psychol. 51, 1–39. doi: 10.1111/j.2044-8317.1998.tb00659.x
- Wolf E. J., Harrington K. M., Clark S. L., Miller M. W. (2013). Sample size requirements for structural equation models: an evaluation of power, bias, and solution propriety. Educ. Psychol. Meas. 76, 913–934. doi: 10.1177/0013164413495237
- Zuffianò A., Alessandri G., Roche-Olivar R. (2012). Valutazione di un programma di sensibilizzazione prosociale: young prosocial animation [Evaluation of a prosocial sensitization program: the Young Prosocial Animation]. Psicol. Educ. 2, 203–219.
Separate-Sample Pretest-Posttest Design: An Introduction
The separate-sample pretest-posttest design is a type of quasi-experiment in which the outcome of interest is measured twice: once before and once after an intervention, each time in a separate group of randomly chosen participants.
The difference between the pretest and posttest measures estimates the intervention’s effect on the outcome.
The intervention can be:
- A medical treatment
- A training or an exposure to some factor
- A policy change, etc.
Characteristics of the separate-sample pretest-posttest design:
- Data from the pretest and posttest come from different groups.
- Participants are randomly assigned to each group (which makes the outcome of the pretest and posttest comparable).
- All study participants receive the intervention (i.e. there is no control group).
Advantages of the separate-sample pretest-posttest design:
The benefit of using a separate-sample pretest-posttest design is that it avoids some of the most common biases that other quasi-experimental designs suffer from.
Here are some of these advantages:
1. Avoids testing bias
Definition: The testing effect is the influence of the pretest itself on the posttest (regardless of the intervention). Testing bias can occur when taking the pretest affects the mood, experience, or awareness of participants, which in turn affects their posttest outcome.
Example: Asking people about a psychological or family problem in a pretest may affect their mood in a way that influences the posttest, or the posttest response rate may decline after participants take a long and time-consuming pretest.
How it is avoided: Perhaps the biggest advantage of using a separate-sample design is that the pretest cannot affect the posttest, since different groups of participants are measured each time.
2. Avoids regression to the mean
Definition: Regression to the mean happens when participants are included in the study based on their extreme pretest scores. The problem is that their posttest measurements will naturally be less extreme, an effect that can be mistaken for that of the intervention.
Example: Some of the top 10 scorers on a first round in a sport competition will most likely lose their top 10 ranking in the second round, because their performance will naturally regress towards the mean. In simple terms, an extreme score is hard to sustain over time.
How it is avoided: Since the posttest participants are not the same as those measured on the pretest, this rules out their inclusion in the study based on their unusual pretest scores, therefore avoiding the regression problem.
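The sports example above can be reproduced in a few lines; the ability and noise distributions below are invented purely to demonstrate the effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Each competitor's score is a stable ability plus round-specific luck
ability = rng.normal(50, 10, n)
round1 = ability + rng.normal(0, 10, n)
round2 = ability + rng.normal(0, 10, n)

# Select the top-10 scorers of round 1 and compare their group means
top10 = np.argsort(round1)[-10:]
drop = round1[top10].mean() - round2[top10].mean()
print(f"top-10 mean changed by {-drop:.1f} points with no intervention at all")
```

Selecting on an extreme round-1 score partly selects on luck, and luck does not repeat in round 2, so the selected group's average falls back toward the mean.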
3. Avoids selection bias
Definition: Selection bias happens when the compared groups differ on some basic characteristics, which can offer an alternative explanation of the outcome and therefore bias the relationship between the intervention and the outcome.
Example: If participants who were less interested in receiving the intervention were somehow more prevalent in the pretest group than in the posttest group, then the outcomes of the two tests could no longer be compared because of selection bias.
How it is avoided: In a separate-sample pretest-posttest design, selection bias can be ruled out as an explanation since the two groups were made comparable through randomization.
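The randomization step can be sketched as follows (the `randomize` helper is hypothetical, not taken from any cited study):

```python
import random

def randomize(participants, seed=7):
    """Shuffle and split participants into pretest and posttest groups,
    making the two samples comparable in expectation."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

pre_group, post_group = randomize(range(100))
print(len(pre_group), len(post_group))  # 50 50
```

Because every participant has the same chance of landing in either group, characteristics such as motivation or prior interest are balanced across groups on average.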
4. Avoids loss to follow-up
Definition: Loss to follow-up can cause serious problems if participants who are lost to follow-up differ in some important characteristics from those who stay in the study.
Example: In studies where participants are followed over time, those who did not feel any improvement may not return for follow-up, and the estimated effect of the intervention will therefore be biased upward.
How it is avoided: Separate-sample pretest-posttest studies are protected against this effect because each group is measured only once, so no participants are followed over time.
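A small simulation with invented numbers and a null intervention (i.e., no true effect) shows how dropout of non-improvers inflates the estimated effect in a followed cohort, while measuring each participant only once, as in the separate-sample design, does not:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Null intervention: posttest scores are pretest scores plus noise
pre = rng.normal(50, 10, n)
post = pre + rng.normal(0, 5, n)   # no real treatment effect

# Followed cohort: participants who felt no improvement do not return,
# so the posttest is observed only for those whose scores rose
returned = post - pre > 0
followed_estimate = post[returned].mean() - pre[returned].mean()

# Separate-sample analog: every participant is measured exactly once,
# so there is no follow-up to lose (approximated here by using everyone)
separate_estimate = post.mean() - pre.mean()

print(f"followed cohort: {followed_estimate:.1f}, "
      f"separate samples: {separate_estimate:.1f}")
```

Even though the intervention does nothing, conditioning on who returns produces a spurious positive effect, while the once-per-person comparison stays near zero.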
Limitations of the separate-sample pretest-posttest design:
For each limitation below, we will discuss how it threatens the validity of the study, as well as how to control it by manipulating the design (adding observations or changing their timing). Statistical techniques can also be used to control these limitations, but they will not be discussed here.
1. History
Definition: History refers to any event other than the intervention that takes place between the pretest and the posttest and has the potential to affect the outcome of the posttest, thereby biasing the study.
Example: When studying the effect of a medical intervention on weight loss, an outside event such as the launch of a documentary that has the potential to change the diet of the study participants can co-occur with the intervention, and become a source of bias.
How to control it: If the resources are available, repeating the study two or more times at different time periods makes History less likely to bias the study, as it would be highly unlikely for a biasing event to occur every time.
2. Maturation
Definition: Maturation is any natural or biological trend that can offer an alternative explanation of the outcome other than the intervention.
Example: Participants growing older in the time period between the pretest and posttest may offer an alternative explanation to an intervention for smoking cessation.
How to control it: Adding another pretest measurement can expose natural trends and thus control for maturation.
3. Mortality
Definition: When the pretest and posttest are separated by a long time period, some participants may become unavailable by the time the posttest is administered. If they differ systematically from those who remain available, then the pretest and posttest groups are no longer comparable.
Example: Over time, patients who become severely sick from a certain medical condition are more likely to be hospitalized and therefore become unavailable for the posttest, creating a source of bias.
How to control it: Taking an additional posttest measurement of the group who received the pretest will eliminate Mortality effects as it provides a measurement of the same type of participants available for the posttest.
4. Instrumentation
Definition: Instrumentation effect refers to changes in the measuring instrument that may account for the observed difference between pretest and posttest results. Note that sometimes the measuring instrument is the researchers themselves who are recording the outcome.
Example: Using 2 interviewers, one for the pretest and another one for the posttest may introduce instrumentation bias as they may have different levels of interest, or different measuring skills that can affect the outcome of interest.
How to control it: Use a group of interviewers randomly assigned to participants.
Finally, adding a control group to the separate-sample pretest-posttest design is highly recommended when possible, as it controls for History, Maturation, Mortality, and Instrumentation at the same time.
Example of a study that used the separate-sample pretest-posttest design:
Lynch and Johnson used a separate-sample pretest-posttest design to evaluate the effect of an educational seminar on 24 medical residents regarding practice-management issues.
A questionnaire that assesses the residents’ knowledge on the subject was used as a pre- and posttest.
The advantages of using a separate-sample pretest-posttest design in this case were:
- Feasibility: Since the questionnaire was time-consuming, asking participants to be tested just once was important for obtaining a high response rate (80% in this case).
- The testing effect was controlled: The participants’ familiarity with the questions asked in the pretest did not affect the posttest.
The study concluded that there was a statistically significant improvement of the knowledge of residents after the seminar.
- Campbell DT, Stanley JC. Experimental and Quasi-Experimental Designs for Research . Wadsworth; 1963.
Further reading
- Experimental vs Quasi-Experimental Design
- Understand Quasi-Experimental Design Through an Example
- One-Group Posttest Only Design
- One-Group Pretest-Posttest Design
- Posttest-Only Control Group Design
- Static Group Comparison Design
- Matched Pairs Design
- Randomized Block Design
Fracturing Processes in Specimens with Internal vs. Throughgoing Flaws: An Experimental Study Using 3D Printed Materials
- Original Paper
- Open access
- Published: 25 September 2024
- Majed Almubarak ORCID: orcid.org/0000-0002-1864-5275 1 ,
- John T. Germaine 2 &
- Herbert H. Einstein 1
The fracturing behavior and associated mechanical characterization of rocks are important for many applications in the fields of civil, mining, geothermal, and petroleum engineering. Laboratory testing of rocks plays a major role in understanding the underlying processes that occur at the larger scale and in predicting rock behavior. Fracturing research requires well-defined and consistent boundary conditions; consequently, the testing design and setup can greatly influence the results. In this study, a comprehensive experimental program using an artificial material was carried out to systematically evaluate the effects of different parameters in rock testing under uniaxial compression. The parameters include compression platen type, specimen centering, loading control method, boundary constraints, and flaw parameters. The results show that these testing conditions have a significant effect on the mechanical behavior of rocks. Using a fixed compression platen helped reduce bulging of the material. Centering of the specimen played a critical role in avoiding buckling and unequal distribution of stress. Slower displacement rates can control the energy released once failure occurs, preventing the specimen from exploding. The frictional end effects were also investigated by comparing friction-reduced and non-friction-reduced end conditions. Very importantly, the study also identified variations in crack initiation and propagation between specimens with internal flaws and specimens with throughgoing flaws. This investigation showed that wing cracks appeared in specimens with throughgoing flaws, while wing cracks with petal cracks were associated with internal flaws. It also showed that the mechanical properties are influenced by the inclination of the flaws and established that specimens with internal flaws generally exhibit higher strength than specimens with throughgoing flaws.
The systematic analysis presented in this work sheds light on important considerations that need to be taken into account when conducting fracture research and adds knowledge to the fundamental understanding of how fractures occur in nature.
Specimen centering is crucial to avoid buckling and unequal distribution of stress during uniaxial compression testing.
Different loading control methods result in different strength properties.
Frictional end effects alter the observed response of the specimen during compressive loading.
Differences in crack initiation and propagation exist between internal flaw and throughgoing flaw specimens.
Mechanical properties are influenced by the inclination of the flaws.
1 Introduction
Fractures are the most common structural features found in all types of rocks and tectonic settings. These features influence the deformability, strength, and transport of fluids in rocks, which is vital for many applications including the production of water, hydrocarbons, and geothermal energy. They can also be used to store the energy generated from renewable sources such as wind and solar, as well as contaminants, CO 2 , and hazardous nuclear waste. Hence, investigating fracturing processes can prove essential for many engineering applications. Consequently, extensive research, including laboratory testing, has been conducted in this area. Laboratory testing has played an important role, e.g., in characterizing the strength of rocks and understanding fracture behavior (Xu et al. 2016 ). Numerous efforts have been made to investigate the mechanical properties of fractured rocks and rock-like materials under compressive loading. Many of these studies were done on prismatic specimens of gypsum, marble, granite, and shale (Reyes 1991 ; Bobet 1997 ; Wong 2008 ; Miller 2008 ; Gonçalves da Silva 2009 ; Morgan 2015 ). In these studies, specimens were subjected to uniaxial compressive loading, and then fracture initiation and propagation mechanisms were captured and analyzed.
Despite a considerable amount of experimental and theoretical studies conducted in this field, there is still a lack of knowledge on the exact mechanisms of many fracture-related processes. To address these gaps, two areas of interest were identified to look at in more detail. The first involved identifying and varying critical boundary conditions associated with rock testing. The second pertains to the design of the pre-existing defects, commonly referred to as “flaws,” and the subsequent investigation of how these flaws influence the fracturing processes during rock testing.
In this context, the main objectives of this study are two-fold. The first is to discuss the apparatus, instrumentation, specimen properties, and procedures used to systematically conduct rock fracturing experiments. The parameters investigated in this study include: the type of compression platen used (fixed vs. flexible), specimen centering, loading control method (displacement vs. load), and frictional end effects (fixed vs. lubricated boundaries). The second objective is to conduct a systematic study comparing the fracture processes in rock experiments with internal flaws to those with throughgoing flaws. It is important to understand how these factors affect rock fracturing experiments so that we can choose them carefully and get the most consistent results.
In most fracture research studies, the flaw is fully penetrating (i.e., throughgoing), while natural rock formations often possess internal flaws rather than throughgoing ones. To date, there has not been a comprehensive study employing fully comparable geometries that explores the similarities or differences between fracture processes in rock experiments with throughgoing flaws and those with exclusively internal flaws. Fracture testing of specimens with internal flaws has been relatively limited compared to specimens with throughgoing flaws for two main reasons. The first is the challenge of monitoring the propagation of internal fractures, which, because natural rock is opaque, requires the use of transparent artificial materials or advanced imaging techniques such as X-ray computed tomography. The second is the difficulty of producing specimens with controlled internal flaw geometries. Hence, the investigations reported in this work were conducted with an artificial material. A visual representation of these two types of pre-existing flaws is shown in Fig. 1.
Fig. 1 Schematic of the two types of pre-existing flaws: (a) internal flaw embedded inside the specimen and (b) throughgoing flaw fully penetrating the specimen
Several techniques have been used by researchers to create specimens with internal flaws and are summarized below:
Hanging inclusions method. This method, which was used by Dyskin et al. ( 2003 ) and Wang et al. ( 2018 ), involves two greased aluminum foil disks that are suspended at a specific angle to one of the loading axes within an aluminum mold using copper wires or cotton threads to model a penny-shaped flaw at the center of a resin sample. The resin and catalyst are mixed and poured into the mold, and once cured, samples are cut and polished. Although this method provides good transparency for resin samples, a drawback is the residual presence of holding wires in the samples after casting. The sample preparation using other castable materials is similar.
Cutting method. First introduced by Adams and Sines ( 1978 ) and later used by Dyskin et al. ( 2003 ), this method involves creating a penny-shaped flaw by cutting a semi-circular slot into the surface of two sample halves and subsequently gluing them back together. Teflon or greased foil disks are inserted into the slots to ensure contact between the faces. This method is limited in creating multiple initial flaws, as additional cutting planes may introduce instability to the sample and uncertainty in test results, making it suitable only for simple flaw arrangements.
Laser method. Germanovich et al. ( 1994 ) employed a high-energy neodymium-doped laser pulse to introduce internal cracks in non-castable transparent materials like silica glass and PMMA. This method avoids cutting planes and enables the creation of multiple internal flaws but is labor-intensive and requires specialized equipment to achieve the desired flaw arrangement.
Thermal induction method. Dyskin et al. ( 2003 ) used thermal induction to create multiple flaws in resin samples by employing an excessively high catalyst ratio, generating considerable heat and thermal stresses. By heating the curing samples in an oven, multiple internal flaws of varying sizes and densities were produced. The method’s limitations include the inability to produce elaborately designed flaw arrangements and the presence of residual stresses in resin samples after the curing process.
There has been an interest in using artificial model materials for as long as mechanical testing of rocks has been performed. Rapid prototyping (RP), additive manufacturing (AM), and three-dimensional printing (3DP) are three interchangeable terms that define a set of methods for the fast, precise, and repeatable production of elements (Zhou and Zhu 2018 ). The technology is based on the process of joining materials layer by layer to form an object using computer-aided design (CAD). Polymeric, metallic, ceramic, and even complex composite components are among the materials used (Ngo et al. 2018 ). Various 3DP techniques have been developed, including fused deposition manufacturing (FDM), stereolithography (SLA), and selective laser sintering (SLS).
Compared to other 3DP techniques, SLA is one of the earliest and most widely used (Vaezi et al. 2012 ). SLA is also one of the most popular AM techniques for polymeric and ceramic materials, as it produces components with high geometrical precision (Melchels et al. 2010 ). Printing with a special photocurable resin is the basis of the SLA method (Gao et al. 2020 ). The liquid photopolymer resin is cured layer-by-layer using an ultraviolet (UV) laser. After a layer is cured, the build plate adjusts, and the next successive resin layer is cured. The process of the resin being crosslinked under UV light exposure is called photopolymerization (Hamzah et al. 2018 ). This process is depicted in Fig. 2 .
Fig. 2 Schematic of the SLA 3D printing process
The use of 3D printing technology in this study provides several advantages. One of the most notable benefits is the ability to control the material properties and produce complex geometries that are otherwise difficult to obtain using traditional methods. This advantage is crucial when investigating the effects of different parameters on the material properties as it eliminates sample-to-sample variability (Kong et al. 2018 ). Moreover, this approach leads to more accurate and reproducible results. In this study, clear resin was chosen as the testing material because of its homogeneity and optically transparent nature. These properties make it an ideal choice for investigating the fracture processes and the internal stress fields of the material through photoelasticity (Wang et al. 2017 ). Also, the use of clear resin as the testing material provides insights into the behavior of the material during testing that would otherwise be impossible to observe with opaque materials.
2 Experimental Procedure
2.1 Basic Material Testing and Selection
The specimens used in this study were made from SLA 3D-printed clear resin material, which is a liquid photopolymer. When exposed to UV light, photo-initiators in the photopolymer produce free radicals (Watters and Bernhardt 2018 ). In an exothermic reaction, the free radicals react with monomers and oligomers within the resin to cross-link and create a network of polymer chains as shown in Fig. 3 . More specifically, the photo-initiator molecule breaks down into two parts in response to UV exposure, and the bond that holds it together becomes two highly reactive radicals. The reactive radicals are transferred by the photo-initiator to the active groups on the monomer and oligomer chains, which in turn react with other active groups to form longer chains (Riccio et al. 2022 ). The process is extremely quick, taking only milliseconds.
Photopolymerization scheme of SLA resin
The mechanical properties of this material were measured through uniaxial compression tests on two intact (no flaw) specimens with dimensions of 4 inches in height, 2 inches in length, and 2 inches in width as shown in Fig. 4 a. The specimens were equipped with both axial and lateral extensometers, as depicted in Fig. 4 b, to accurately measure their displacements during the testing process. The tests were run at a constant displacement rate of 1 mm per minute (mm/min).
Details of prismatic intact specimen used to determine the material properties: a rendering of the specimen with dimensions (4 in × 2 in × 2 in), b photograph of the specimen subjected to uniaxial loading with axial and lateral extensometers attached
The elastic constants of the material, Young’s modulus ( E ) and Poisson’s ratio ( v ), are presented in Table 1 .
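The reduction from extensometer readings to the two elastic constants amounts to fitting the slopes of the linear portions of the stress versus axial-strain and lateral-strain versus axial-strain curves. A minimal sketch follows; the data points are synthetic, illustrative values, not the measured data behind Table 1.

```python
# Estimate Young's modulus (E) and Poisson's ratio (v) by least-squares
# slopes over the linear-elastic portion of the curves. Readings below
# are hypothetical, not the measurements reported in Table 1.
def elastic_constants(stress, axial_strain, lateral_strain):
    """E = d(stress)/d(axial strain); v = -d(lateral strain)/d(axial strain)."""
    n = len(stress)
    mean = lambda xs: sum(xs) / n
    sx, ss, sl = mean(axial_strain), mean(stress), mean(lateral_strain)
    var = sum((x - sx) ** 2 for x in axial_strain)
    E = sum((x - sx) * (s - ss) for x, s in zip(axial_strain, stress)) / var
    v = -sum((x - sx) * (l - sl) for x, l in zip(axial_strain, lateral_strain)) / var
    return E, v

# Synthetic linear-elastic readings (stress in MPa, strains dimensionless)
axial = [0.000, 0.001, 0.002, 0.003]
lateral = [0.0000, -0.0004, -0.0008, -0.0012]
stress = [0.0, 2.0, 4.0, 6.0]
E, v = elastic_constants(stress, axial, lateral)
print(round(E), round(v, 2))  # 2000 0.4 for these synthetic points
```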
2.2 Specimen Preparation
For this study, a series of specimens were prepared, each with a prismatic configuration and a height dimension that was twice the width. The most distinct feature of these specimens was the presence of a quasi-elliptical (ovaloid-shaped) flaw at the center of each specimen, as shown in Fig. 5 . The choice of a prismatic configuration and a height dimension twice the width was made to produce a uniform stress distribution across the specimen during the uniaxial compression tests. Including a flaw was also deliberate, making it possible to study the effect of flaws on the mechanical properties and fracture behavior of the material. These specimens, with their unique geometries and flaw specifications, were chosen carefully to make possible a comprehensive investigation of the different parameters and conditions involved in this study.
Example rendering of prismatic specimen used in this study, including the pre-existing vertical flaw with rounded tips. This figure includes a an isometric view and b a front face view, along with a detailed view of the flaw geometry. The example shown has dimensions (3 in × 1.5 in × 1.5 in) and flaw dimensions (0.25 in × 0.02 in). Please note that the flaw can be throughgoing or internal as represented in Fig. 1
All specimens were fabricated using a commercial stereolithography (SLA) 3D printer equipped with a class 1 violet laser source. The laser specifications are: 250 mW power output, 85-micron spot size, and a nominal resolution of 25 microns. The clear photoreactive resin material used in the 3D printing process is composed of a mixture of methacrylated oligomer (75–90% of the material), methacrylated monomer (25–50%), and a photo-initiator, diphenyl (2,4,6-trimethylbenzoyl) phosphine oxide (less than 1%). This high-quality resin material, as described by Marin et al. (2021), has proven to be suitable for creating precise and detailed specimens.
All specimens were 3D printed with the same orientation, with their longitudinal axis—the axis parallel to the longest dimension of each specimen—aligned parallel to the printing platform. The 3D printing scheme is summarized in Fig. 6 . The process starts with designing a 3D CAD model using a suitable modeling software. After designing, the 3D model is saved as an STL file, which is a standard file format used in 3D printing. This STL file is then loaded into a slicing software that is compatible with the 3D printer being used. The slicing software is responsible for dividing the 3D model into many thin layers and creating a g-code that contains position sequences for the 3D printer to build the object layer by layer. The g-code is then sent to the 3D printer, which reads the code and starts the printing process. Once the printing process is complete, the 3D build platform is removed from the printer and the post-processing stage begins.
Summary of SLA 3D printing process: a 3D CAD model generated using a modeling software, b CAD model is converted to an STL file, c printer software digitally slices the model in the STL file into a series of cross-sectional layers, assigns them with printing information (e.g., layer thickness and printing path), and instructs the SLA printer to print, d 3D printing build platform is removed once printing is complete and the post-processing stage begins
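As a rough illustration of the slicing step, the number of cross-sectional layers follows directly from the build height and the layer thickness. The helper below is a sketch, not part of any slicer's API; the 25-micron value is the printer's nominal resolution quoted above, and since the specimens were printed lying down, the build height is taken as the 1.5 in (38.1 mm) width rather than the 3 in length.

```python
import math

# Estimate how many cross-sectional layers the slicer produces for a
# given build height and layer thickness. Illustrative helper only.
def layer_count(height_mm: float, layer_thickness_um: float = 25.0) -> int:
    return math.ceil(height_mm / (layer_thickness_um / 1000.0))

# Specimens were printed with the longitudinal axis parallel to the
# platform, so the build height is the 1.5 in (38.1 mm) dimension.
print(layer_count(38.1))  # 1524 layers at 25 microns
```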
The average printing duration was approximately 12 h for three specimens printed at a time. After the 3D printing process was completed, the build platform was inserted into an isopropyl alcohol (IPA) bath, which automatically rinsed the specimens for 20 min. This step was crucial in ensuring the complete removal of any residual resin material, which could negatively affect the specimens. Following the rinse, the specimens were allowed to dry at room temperature for six hours. After this period, the specimens were released from the build platform, and the supports were removed. The setup of the 3D printer used in this study is shown in Fig. 7. All specimens were tested in a green state, i.e., with no further curing after printing. Furthermore, the time between printing and testing was kept constant for all specimens to minimize any potential aging effects and accidental exposure to UV light, which might cause hardening and crosslinking of the polymers. This procedure was carried out in a similar manner for all the experiments to maintain consistency and provide a controlled environment for comparison.
SLA 3D printer and post-processing setup
2.3 Test Setup
The specimens in this study were subjected to unconfined compression along the axial direction, also known as uniaxial compression. A Baldwin load frame, which is a hydraulic loading machine, was used for this purpose. The machine has a maximum loading capacity of 60 kips and an 8-inch stroke, allowing for sufficient force to be applied on the specimens during testing. The testing setup, as shown in the photograph in Fig. 8 , also included a high-resolution and a high-speed camera to track the fracturing processes during the tests. The load, axial displacement, and time data were recorded for all the experiments, and the data collection was synchronized with the image frames from the high-resolution and high-speed cameras.
Photograph of the test setup showing the different equipment involved for uniaxial compression and data acquisition
The crack events were recorded using a high-resolution camera that takes images at 1 s intervals throughout the entire duration of the test while the high-speed camera captures only 2.5 s of the test but at a much faster rate of 5000 frames per second. Both cameras were connected to a data acquisition system that facilitated the synchronization of the data (post-processing) with time against the stress and strain values obtained from the Baldwin load frame. This coordination of the cameras with the Baldwin load frame made it possible to accurately relate the images with their corresponding stress and strain values, resulting in a more comprehensive understanding of the test results. A schematic of the test setup depicting the cameras and the data acquisition system, along with the Baldwin load frame is shown in Fig. 9 .
Schematic of uniaxial compression tests on a prismatic specimen. The central data acquisition system saves vertical load and displacement data from the load frame, as well as high-resolution and high-speed camera images, to relate observed events to stress–strain-time data
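The post-processing synchronization described above amounts to matching each camera frame to the nearest load-frame reading in time. A minimal sketch, with hypothetical timestamps and loads:

```python
import bisect

# Match each camera frame timestamp to the load reading with the
# closest timestamp. Timestamps and loads below are hypothetical.
def match_frames(frame_times, load_times, loads):
    """load_times must be sorted ascending; returns one load per frame."""
    matched = []
    for t in frame_times:
        i = bisect.bisect_left(load_times, t)
        if i == 0:
            j = 0
        elif i == len(load_times):
            j = len(load_times) - 1
        else:
            # pick the nearer of the two neighbouring readings
            j = i if load_times[i] - t < t - load_times[i - 1] else i - 1
        matched.append(loads[j])
    return matched

load_times = [0.0, 0.5, 1.0, 1.5, 2.0]   # seconds
loads = [0, 50, 100, 150, 200]           # lbs
print(match_frames([0.1, 0.9, 2.3], load_times, loads))  # [0, 100, 200]
```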
2.4 Test Program
The testing program consisted of a seating stage, in which the specimens were preloaded to a target of 100 lbs at a rate of 0.1 inches per minute (in/min). This step was crucial to ensure the specimens were in contact with the loading platen and under a compressive load before the main testing phase. This also minimizes any potential errors that may arise because of improper contact between the specimen and platen. After the seating stage, the specimens were loaded at a specified rate, until the end of the test. The Baldwin load frame has a maximum capacity of 60 kips and was programmed to stop the test once the load capacity was reached or when a 2% load drop was detected. The test was manually stopped in cases where a significant amount of crack propagation occurred to avoid specimen failure and to preserve the specimen, when possible. A load vs. time plot detailing the preloading, testing, and stopping phases as programmed in the experiments is shown in Fig. 10 .
Load versus time plot that depicts the preloading, testing, and stopping phases as programmed in the experiments
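The programmed 2% load-drop stop criterion can be expressed as a running-peak check on the load trace; the implementation below is a sketch, and the load series is synthetic.

```python
# Detect the first sample at which the load has fallen 2% below its
# running peak, mirroring the programmed stop criterion. Sketch only.
def stop_index(loads, drop=0.02):
    """Index of first reading `drop` below the running peak, else None."""
    peak = float("-inf")
    for i, p in enumerate(loads):
        peak = max(peak, p)
        if p < peak * (1.0 - drop):
            return i
    return None

trace = [100, 500, 1000, 2000, 2050, 2049, 2000]  # lbs (synthetic)
print(stop_index(trace))  # 6: 2000 lbs is below 98% of the 2050 lbs peak
```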
3 Results and Discussion
In this section, the investigated parameters are discussed in detail. The results are presented in the form of true stress plotted against strain for the uniaxial experiments. The true stress is determined by dividing the load by the instantaneous cross-sectional area. Relevant parameters such as yield stress and ultimate compressive stress are also compared. Yield stress is defined as the value above which the material begins to deform plastically and is determined using the 0.2% offset method. This method involves constructing a line parallel to the initial (linear) portion of the stress–strain curve but offset by 0.2% from the origin on the strain axis and identifying the point of intersection with the curve. Ultimate compressive stress was defined as the peak stress recorded. Visual observations of the specimens throughout the tests are also discussed. An example stress–strain plot is shown in Fig. 11 , demonstrating the locations of the yield stress and ultimate compressive stress.
Example stress–strain curve for specimen subjected to uniaxial loading with yield stress (orange triangle) and ultimate compressive stress (gray diamond) markers
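The two quantities defined above can be computed directly from the recorded data. In the sketch below, the instantaneous cross-sectional area is approximated from the engineering strain under an assumed constant-volume condition, and the 0.2% offset yield is found by intersecting the curve with the offset line; the data points are synthetic.

```python
# True stress and 0.2% offset yield stress. The constant-volume area
# correction and the data points below are illustrative assumptions.
def true_stress(load, area0, eng_strain):
    """Compressive true stress = load / instantaneous area, taking the
    instantaneous area as A0 / (1 - strain) under constant volume."""
    return load / (area0 / (1.0 - eng_strain))

def offset_yield(strains, stresses, E, offset=0.002):
    """First crossing of the curve with sigma = E * (eps - offset),
    linearly interpolated between samples; None if no crossing."""
    for i in range(1, len(strains)):
        line = E * (strains[i] - offset)
        if stresses[i] <= line:
            f0 = stresses[i - 1] - E * (strains[i - 1] - offset)
            f1 = stresses[i] - line
            t = f0 / (f0 - f1)
            return stresses[i - 1] + t * (stresses[i] - stresses[i - 1])
    return None

# Synthetic curve: elastic up to ~4 MPa, then flattening
strains = [0.000, 0.001, 0.002, 0.004, 0.006]
stresses = [0.0, 2.0, 4.0, 6.0, 6.5]
print(round(offset_yield(strains, stresses, E=2000.0), 2))  # 6.29
```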
3.1 Compression Platen and Centering
Compression platens are flat plates or disks used to apply a compressive force on a specimen. There are many types available, with the two main types being spherically seated (flexible) and fixed platens. An image of these platens is shown in Fig. 12 .
Photograph of flexible (left) and fixed (right) compression platens
Using a flexible platen helps a specimen self-align during compression testing and is required for many ASTM tests for different materials, including concrete and wood. However, initial experiments showed that the platen tilts at high loads. This causes the specimen to experience uneven loading and buckling, which eventually makes it fail explosively. The likely cause of this behavior is uncentering, i.e., the specimen not being perfectly aligned with the compression platen before the start of the test. To address this issue, an acrylic centering template was fabricated to ensure the proper centering of the specimen.
A high-power laser beam was utilized to accurately cut a circular, 0.5-inch thick, centering template from acrylic material with a diameter similar to that of the compression platen. A hole in the middle with dimensions equivalent to the base of the specimen to be tested was also included. The use of laser cutting technology allowed for precise and efficient fabrication of the centering template. A side view schematic and top view image of the centering template are shown in Fig. 13 .
a Side view schematic of centering template and b top view photograph of the centering template used to align the specimens before testing
The centering template is placed before the experiment as shown in Fig. 14 . It is important to note that the centering template is removed by raising the load frame before the test is started.
a Overview photograph of the centered specimen under uniaxial load, b close-up view of centering template placed on the compression platen, and c specimen aligned using the centering template
The effect of compression platen type and the specimen centering was investigated by conducting uniaxial tests on four identical prismatic specimens with dimensions (3 in × 1.5 in × 1.5 in). For each compression platen type—fixed and flexible—tests were conducted on both a centered and an uncentered specimen. An uncentered specimen is defined as having its center located 0.75 inches away from the center of the loading platen. Given that the specimens were 1.5 inches wide, this offset represents a displacement equal to half the width of the specimen. Figure 15 shows the stress–strain data for the four tests at a displacement-controlled loading rate of 1 mm/min.
Stress–strain data comparing specimens subjected to uniaxial loading with different compression platen types and specimen centering
The yield stress and ultimate compressive stress values were determined from the stress–strain plots and are shown below in Fig. 16 .
Yield stress and ultimate compressive stress values for specimens subjected to uniaxial loading with different compression platen types and specimen centering
Evidently, little difference exists in the mechanical properties between the specimens tested with the fixed compression platen regardless of centering. However, the uncentered specimen exhibited more bulging, which could lead to eventual buckling and explosive failure at higher loads. Images of the specimens subjected to uniaxial loading with the fixed compression platen are shown in Fig. 17.
Frames of specimens subjected to uniaxial loading using the fixed compression platen. Uncentered specimen at: a start of the test (low loading) and b end of the test (high loading). Centered specimen at: c start of the test (low loading) and d end of the test (high loading)
For the flexible compression platen tests, the uncentered specimen clearly experienced unequal distribution of load which caused it to deform on one side more than the other. This effect became more visually apparent at higher loads but is still present from the beginning of the test. Images of the specimens subjected to uniaxial loading with the flexible compression platen are shown in Fig. 18 .
Frames of specimens subjected to uniaxial loading using the flexible compression platen. Uncentered specimen at: a start of the test (low loading) and b end of the test (high loading). Centered specimen at: c start of the test (low loading) and d end of the test (high loading)
The fixed compression platen along with the centering template were used for the rest of the experiments in this study. This was done to maintain consistency and provide a reliable comparison of the tests.
3.2 Loading Control
Two of the most common loading control methods are displacement control (mm/min) and load control (lb/min). There is no prescribed loading control method or rate for uniaxial compression tests, but recommendations exist (Isah et al. 2020; ISRM 2007; ASTM 2017) based on the time to failure and the size of the specimen. This can be problematic because different rocks behave differently under loading.
The effect of the loading control method and rate was investigated by conducting uniaxial tests on six identical prismatic specimens with dimensions (3 in × 1.5 in × 1.5 in). Figure 19 shows the stress–strain data for the six tests at displacement-controlled and load-controlled loadings at different rates.
Stress–strain data comparing specimens subjected to uniaxial loading at different displacement (solid curves) and loading (dotted curves) rates
The yield stress and ultimate compressive stress values were determined from the stress–strain plots and are shown in Fig. 20 . There are no ultimate compressive stress values for the specimens tested with load control because the material did not exhibit strain softening and thus no peak stress could be determined.
Yield stress and ultimate compressive stress values for specimens subjected to uniaxial loading at different displacement and loading rates
The results above agree with the literature (Komurlu 2018), indicating that the mechanical properties of rocks depend on the loading control method and rate applied. It is evident that increasing the rate increases the yield stress and ultimate compressive stress values (where applicable). Under load control, strain increased slowly at the beginning of the test (at low stress) but rapidly at higher stress. Displacement-based loading was more controlled and produced more stable fractures (e.g., cracks that steadily increase in length without any sudden jumps or branching). Hence, it was used for the rest of the experiments in this study at a rate of 1 mm/min.
3.3 Boundary Conditions (Frictional End Effects)
The existence of a uniform stress state on loading surfaces, which are usually the principal planes, is a basic assumption in rock testing (Labuz and Bridell 1993 ). During compressive loading, however, this condition does not exactly exist because a frictional constraint develops at the interface between the specimen and the loading system. Consequently, the observed response of the specimen may be affected by unknown boundary conditions. Hence, it is important to understand these conditions.
Unconstrained deformation can be achieved by finding an appropriate friction reducer. The friction reducers selected for testing in this study are: silicone-based lubricant, parchment paper, and parafilm. They were applied at both the top and bottom boundaries of the specimen. Figure 21 shows the stress–strain curves of the tests with different boundary conditions. The no friction reducer (fixed-end) specimen is also included as a base case for comparison. All four tests were run on identical prismatic specimens with dimensions (3 in × 1.5 in × 1.5 in) at a displacement-controlled loading rate of 1 mm/min.
Stress–strain data comparing specimens subjected to uniaxial loading with different boundary conditions
The yield stress and ultimate stress values were determined from the stress–strain plots and are shown below in Fig. 22 .
Yield stress and ultimate compressive stress values for specimens subjected to uniaxial loading with different boundary conditions
The results indicate that the specimen with parafilm had the highest yield stress and ultimate compressive stress values, with the parchment and lubricant specimens in between, and the specimen with no friction reducer having the lowest values.
Even with a friction reducer, barreling was evident. This is likely due to residual frictional constraints that the reducers did not fully eliminate, although these constraints were somewhat reduced for the specimens tested with a friction reducer. A sketch of each specimen at 33% strain along with the corresponding image frame is shown in Fig. 23. It is worth noting that, in addition to the distortion, there is also a distinct difference in the compression behavior observed. More specifically, the specimen with no friction reducer and the specimen with parchment displayed symmetry during compression, whereas the specimen with lubricant and the specimen with parafilm exhibited asymmetry. Further investigation into other friction reducers could prove useful.
Sketch alongside the image frame of specimens subjected to uniaxial loading at 33% strain with: a no friction reducer, b lubricant, c parchment, and d parafilm
3.4 Flaw Parameters
Specimens can be produced with different flaw specifications using 3D printing technology. A series of experiments are presented in this section comparing the fracturing behavior of specimens containing internal flaws with specimens containing throughgoing flaws at several flaw inclinations. Identical pre-existing quasi-elliptical (ovaloid-shaped) flaws with dimensions (0.5 in × 0.03 in) were placed at the center of each specimen. A schematic of specimens with the two types of pre-existing flaws is again shown in Fig. 24 .
Schematic of the two types of pre-existing flaws in the specimens prepared: a internal flaw and b throughgoing flaw
3.4.1 Internal Flaws
The effect of flaw inclination on specimens with internal flaws was investigated by conducting uniaxial tests on five prismatic specimens with dimensions (3 in × 1.5 in × 1.5 in), each having a different flaw inclination. The flaw inclinations, relative to the horizontal axis of the specimens, were: 0°, 30°, 45°, 60°, and 90°. Figure 25 shows the stress–strain data for the five tests at a displacement-controlled loading rate of 1 mm/min.
Stress–strain data comparing specimens with internal flaws at different flaw inclinations subjected to uniaxial loading
The yield stress and ultimate compressive stress values were determined from the stress–strain plots and are shown below in Fig. 26 .
Yield stress and ultimate compressive stress values for specimens with internal flaws at different flaw inclinations subjected to uniaxial loading
Evidently, the flaw inclination influenced the yield stress and ultimate compressive stress values. This may be attributed to the stress distribution around the flaw changing with the flaw orientation. The results indicate that the specimen with a 0° (horizontal) flaw had the highest yield stress and ultimate compressive stress values, followed by the specimen with a 90° (vertical) flaw, with the specimens with intermediate flaw inclinations (30°, 45°, and 60°) having the lowest values.
The cracks that propagated from the flaw differed for each flaw inclination. For the 0° (horizontal) flaw specimen, a crack developed in the direction of the diagonal axis of the specimen. For the inclined flaws, wing cracks developed but, interestingly, propagated in a curved trajectory around the flaw. Petal cracks also initiated at the tips of these flaws, curving around and eventually wrapping around the flaw, forming patterns associated with the wing cracks. For the 90° (vertical) flaw, a crack propagated in the vertical direction along the axis of compression, as was expected. Image frames of the five specimens with internal flaws taken at the beginning of the test (labeled \({t}_{\text{initial}}\) ) and at the end of the test (labeled \({t}_{\text{final}}\) ) are shown in Fig. 27.
Image frames of specimens with internal flaws at different inclinations subjected to uniaxial loading, where: a 0°, 30°, 45°, 60°, and 90°, i) \({t}_{\text{initial}}\) , and ii) \({t}_{\text{final}}\)
3.4.2 Throughgoing Flaws
The effect of flaw inclination on specimens with throughgoing flaws was investigated by conducting uniaxial tests on five prismatic specimens with dimensions (3 in × 1.5 in × 1.5 in), each having a different flaw inclination. The flaw inclinations, relative to the horizontal axis of the specimens, were: 0°, 30°, 45°, 60°, and 90°. Figure 28 shows the stress–strain data for the five tests at a displacement-controlled loading rate of 1 mm/min.
Stress–strain data comparing specimens with throughgoing flaws at different flaw inclinations subjected to uniaxial loading
The yield stress and ultimate compressive stress values were determined from the stress–strain plots and are shown below in Fig. 29 .
Yield stress and ultimate compressive stress values for specimens with throughgoing flaws at different flaw inclinations subjected to uniaxial loading
The results followed a similar trend to that observed for the internal flaw specimens. The specimen with a 0° (horizontal) flaw had the highest yield stress and ultimate compressive stress values, followed by the specimen with a 90° (vertical) flaw, with the specimens with intermediate flaw inclinations (30°, 45°, and 60°) having the lowest values. In this case, however, the 45° specimen had the lowest yield stress and ultimate compressive stress values of the three.
The cracks that propagated from the throughgoing flaws showed distinct patterns at different flaw inclinations. For the 0° (horizontal) flaw, a vertical crack developed near the center of the flaw. For the 30° inclined flaw, wing cracks and several branching cracks developed. For the 45° inclined flaw, a more prominent, well-defined wing crack was observed propagating from the tip of the flaw along the axis of compression. For the 60° inclined flaw, cracks similar to those from the 30° inclined flaw were observed, but at a steeper angle. For the 90° (vertical) flaw, cracks initiated at the flaw edges and propagated vertically. Image frames of the five specimens with throughgoing flaws taken at the beginning of the test (labeled \({t}_{\text{initial}}\) ) and at the end of the test (labeled \({t}_{\text{final}}\) ) are shown in Fig. 30.
Image frames of specimens with throughgoing flaws at different inclinations subjected to uniaxial loading, where: a 0°, 30°, 45°, 60°, and 90°, i) \({t}_{\text{initial}}\) , and ii) \({t}_{\text{final}}\)
3.4.3 Comparison of Internal Flaws and Throughgoing Flaws
The yield stress and ultimate compressive stress values for specimens with internal flaws and specimens with throughgoing flaws at the different flaw inclinations are shown below in Fig. 31 .
Yield stress (YS) and ultimate compressive stress (UCS) values of specimens with internal flaws (solid curves) and specimens with throughgoing flaws (dotted curves) at different flaw inclinations
The results show that specimens with internal flaws generally exhibit higher yield stress and ultimate compressive stress values than those with throughgoing flaws. A possible explanation for this is that for the specimens with throughgoing flaws, the remaining intact area is smaller than that for the internal flaw specimens. Hence, the stress concentration on the intact area for the throughgoing flaw specimens is higher than the stress concentration for the internal flaw specimens, leading to lower yield strength and ultimate compressive strength values.
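The intact-area argument can be made concrete with a nominal net-section calculation. In the sketch below, the cross-section dimensions and flaw length follow the specimens in this section, but the internal flaw's penetration depth is an assumed value (the text gives only the flaw's in-plane dimensions), and the calculation ignores stress concentrations at the flaw tips.

```python
# Nominal stress on the intact area of the plane containing the flaw.
# Simplified illustration: uniform stress on the net section, no tip
# stress concentration. Internal penetration depth is an assumption.
def net_section_stress(load, width, depth, flaw_len, flaw_depth):
    """load / (gross area - flaw area). For a throughgoing flaw,
    flaw_depth equals the full specimen depth."""
    return load / (width * depth - flaw_len * flaw_depth)

load = 9000.0  # lbs (illustrative)
# throughgoing: 0.5 in flaw spans the full 1.5 in depth
through = net_section_stress(load, 1.5, 1.5, 0.5, 1.5)
# internal: same flaw length, assumed to penetrate only 0.75 in
internal = net_section_stress(load, 1.5, 1.5, 0.5, 0.75)
print(through > internal)  # True: less intact area -> higher stress
```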
Crack initiation and propagation behavior differed between specimens with internal flaws and specimens with throughgoing flaws. This difference is most apparent for the inclined (30°, 45°, and 60°) flaw specimens: tensile wing cracks appeared in specimens with throughgoing flaws, while wing cracks with petal cracks were associated with the internal flaws. For throughgoing flaws, the cracks formed at the tips of the flaw and grew in the direction of loading (towards the ends of the specimen). For internal flaws, by contrast, the cracks formed at the tips of the flaw, wrapped around it, and formed petal cracks. A similar wrapping effect was observed by Dyskin et al. (1994) and may be a result of principal stresses near the pre-existing flaw acting in a radial direction (Germanovich et al. 1994). A front-facing view of the specimen images with 45° internal and throughgoing flaws, taken after the experiment and illustrating the difference observed in crack behavior, is shown in Fig. 32.
Images taken of specimens with 45° a internal and b throughgoing flaws after the experiment showing the difference in crack behavior
A 3D model was developed for the same specimens, with 45° internal and throughgoing flaws, after the experiment and is shown in Fig. 33 .
Screenshots of 3D model of specimens with 45° a internal and b throughgoing flaws after the experiment, showing the difference in crack behavior from the front face and isometric views, respectively. The blue color represents the initial flaw, the green color represents the primary wing cracks, and the yellow color represents the petal cracks
4 Summary and Conclusions
This study achieved two main objectives. The first objective was a detailed investigation of the boundary effects. This was done by detailing the apparatus, instrumentation, specimen properties, and procedures used in systematically conducting rock fracturing experiments, focusing on compression platen type (fixed vs. flexible), specimen centering, loading control method (displacement vs. load), and frictional end effects (fixed vs. lubricated boundaries). The second objective was to conduct a comprehensive study comparing the fracture processes in experiments with internal flaws and throughgoing flaws at different flaw inclinations. All the tests were conducted on 3D-printed artificial material specimens.
The results lead to the following conclusions:
Objective 1:
Comparing the compression platen types generally showed higher ultimate compressive stress and yield stress values when flexible platens were used. Moreover, using a fixed compression platen helps reduce the bulging of the specimen.
Centering the specimen is crucial for uniform deformation and reduced bulging.
Varying loading control methods resulted in different strength properties. Specifically, an increase in rate increased the yield stress and ultimate compressive stress values for both load-controlled and displacement-controlled loadings.
Adding friction reducers at the boundary generally increases strength, though barreling may still occur.
Objective 2:
Differences in crack initiation and propagation exist between internal flaw and throughgoing flaw specimens. This investigation showed that wing cracks appeared in specimens with throughgoing flaws, while wing cracks with petal cracks were associated with the internal flaws. It also showed that the mechanical properties are influenced by the inclination of the flaws and established that specimens with internal flaws generally exhibit higher strength compared to specimens with throughgoing flaws.
Examining these different experimental conditions has provided a better understanding of how they affect rock fracturing experiments. Test design and conditions can greatly influence the results and should therefore be chosen carefully in future testing.
Data availability
Not applicable.
Adams M, Sines G (1978) Crack extension from flaws in a brittle material subjected to compression. Tectonophysics 49(1–2):97–118
American Society for Testing and Materials (2017) Standard test methods for compressive strength and elastic moduli of intact rock core specimens under varying states of stress and temperatures. ASTM International. https://www.astm.org/d7012-14e01.html
Bobet A (1997) Fracture coalescence in rock materials: experimental observations and numerical predictions. Doctoral dissertation, Massachusetts Institute of Technology
Dyskin AV, Jewell RJ, Joer H, Sahouryeh E, Ustinov KB (1994) Experiments on 3-D crack growth in uniaxial compression. Int J Fract 65(4):R77–R83
Dyskin AV, Sahouryeh E, Jewell RJ, Joer H, Ustinov KB (2003) Influence of shape and locations of initial 3-D cracks on their growth in uniaxial compression. Eng Fract Mech 70(15):2115–2136
Gao Y, Wu T, Zhou Y (2020) Application and prospective of 3D printing in rock mechanics: a review. Int J Miner Metall Mater 28(1):1–17
Germanovich LN, Salganik RL, Dyskin AV, Lee KK (1994) Mechanisms of brittle fracture of rock with pre-existing cracks in compression. Pure Appl Geophys 143(1–3):117–149
Gonçalves da Silva BM (2009) Modeling of crack initiation, propagation, and coalescence in rocks. Master’s thesis, Massachusetts Institute of Technology
Hamzah HH, Shafiee SA, Abdalla A, Patel BA (2018) 3D printable conductive materials for the fabrication of electrochemical sensors: a mini review. Electrochem Commun 96:27–31
Isah BW, Mohamad H, Ahmad NR, Harahap ISH, Al-Bared MAM (2020) Uniaxial compression test of rocks: review of strain measuring instruments. IOP Conf Ser Earth Environ Sci 476(1):012039
ISRM (2007) The complete ISRM suggested methods for rock characterization, testing and monitoring: 1974–2006, 1st edn. In: Ulusay R, Hudson J (eds) Ankara
Komurlu E (2018) Loading rate conditions and specimen size effect on strength and deformability of rock materials under uniaxial compression. Int J Geo Eng 9(1)
Kong L, Ostadhassan M, Li C, Tamimi N (2018) Can 3-D printed gypsum samples replicate natural rocks? An experimental study. Rock Mech Rock Eng 51(10):3061–3074
Labuz JF, Bridell JM (1993) Reducing frictional constraint in compression testing through lubrication. Int J Rock Mech Mining Sci Geom Absts 30(4):451–455
Marin E, Boschetto F, Zanocco M, Doan HN, Sunthar TPM, Kinashi K, Iba D, Zhu W, Pezzotti G (2021) UV-curing and thermal ageing of methacrylated stereo-lithographic resin. Polym Degrad Stab 185:109503
Melchels FPW, Feijen J, Grijpma DW (2010) A review on stereolithography and its applications in biomedical engineering. Biomaterials 31(24):6121–6130
Miller JT (2008) Crack coalescence in granite. Master’s thesis, Massachusetts Institute of Technology
Morgan SP (2015) An experimental and numerical study on the fracturing processes in Opalinus shale. Doctoral dissertation, Massachusetts Institute of Technology
Ngo TD, Kashani A, Imbalzano G, Nguyen KTQ, Hui D (2018) Additive manufacturing (3D printing): a review of materials, methods, applications and challenges. Compos B Eng 143:172–196
Reyes OML (1991) Experimental study and analytical modelling of compressive fracture in brittle materials. Doctoral dissertation, Massachusetts Institute of Technology
Riccio C, Civera M, Grimaldo Ruiz O, Pedullà P, Rodriguez Reinoso M, Tommasi G, Vollaro M, Burgio V, Surace C (2022) Effects of curing on photosensitive resins in SLA additive manufacturing. Appl Mech 2021(2):942–955
Google Scholar
Vaezi M, Seitz H, Yang S (2012) A review on 3D micro-additive manufacturing technologies. Int J Adv Manuf 67(5–8):1721–1754
Wang H, Dyskin A, Pasternak E, Dight P, Sarmadivaleh M (2018) Effect of the intermediate principal stress on 3-D crack growth. Eng Fract Mech 204:404–420
Wang L, Ju Y, Xie H, Ma G, Mao L, He K (2017) The mechanical and photoelastic properties of 3D printable stress-visualized materials. Sci Rep 7(1):10918
Watters MP, Bernhardt ML (2018) Curing parameters to improve the mechanical properties of stereolithographic printed specimens. Rapid Prototyp J 24(1):46–51
Wong LNY (2008) Crack coalescence in molded gypsum and Carrara marble. Doctoral dissertation, Massachusetts Institute of Technology
Xu H, Zhou W, Xie R, Da L, Xiao C, Shan Y, Zhang H (2016) Characterization of rock mechanical properties using lab tests and numerical interpretation model of well logs. Math Probl Eng 2016:1–13
Zhou T, Zhu JB (2018) Identification of a suitable 3D printing material for mimicking brittle and hard rocks and its brittleness enhancements. Rock Mech Rock Eng 51(3):765–777
Download references
Acknowledgements
The authors would like to thank the MIT Rock Mechanics Group for continuous discussion regarding this topic. In addition, we would like to extend special thanks to Magreth Kakoko for support with the CAD designs and specimen preparation procedures.
Open Access funding provided by the MIT Libraries. No external funding was used.
Author information
Authors and affiliations
Massachusetts Institute of Technology, Cambridge, MA, USA
Majed Almubarak & Herbert H. Einstein
Tufts University, Medford, MA, USA
John T. Germaine
Corresponding author
Correspondence to Majed Almubarak.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Almubarak, M., Germaine, J.T. & Einstein, H.H. Fracturing Processes in Specimens with Internal vs. Throughgoing Flaws: An Experimental Study Using 3D Printed Materials. Rock Mech Rock Eng (2024). https://doi.org/10.1007/s00603-024-04168-y
Received: 30 May 2024
Accepted: 06 September 2024
Published: 25 September 2024
DOI: https://doi.org/10.1007/s00603-024-04168-y
- Uniaxial compression
- Rock testing
- 3D printing
- Mechanical properties