How the internet changes us and our science

In recent years, web-based scientific research has been expanding and reinventing itself constantly. Publications in the Journal of Personality and Social Psychology based on research conducted with web-based tools increased by about 543% from 2008 to 2009 (Denissen, Neumann, & van Zalk, 2010).

With near-universal internet access in most of the developed world (e.g., 90% of Sweden’s population has daily access to the internet, as Internet World Stats reports from 2001 to 2009 show), new technology not only affects us on a daily basis but also shapes our social interactions and the way in which we conduct research. In addition to offline psychological data collection via questionnaires and experiments, web-based research through online surveys, apps, and dedicated web applications can facilitate and amplify scientific data collection.

Making use of these new technological opportunities, research in psychology and other social sciences has therefore become more virtual and online-based. We collect data about ourselves and the world around us online, answer questionnaires on our phones while traveling home, or participate in diary studies before going to bed.

Online web-based data collection offers many advantages to scientific research. Most importantly:

  1. Data can be collected more easily and economically.
  2. Entered data can be validated in real time and the user can be prompted for correction.
  3. Data anonymity can be guaranteed if researchers ensure the anonymous and separate storage of participants’ answers and their ID codes.
  4. Researchers can reach a more representative sample much more easily, especially when distributing their surveys via various social media platforms.

In their brilliant article “How the internet is changing the implementation of traditional research methods, people’s daily lives, and the way in which developmental scientists conduct research”, Denissen, Neumann, and van Zalk (2010) explain the opportunities and challenges this new generation of online research provides. They explain why web-based research has risen to such popularity in the past decade and what is needed to conduct it.

The authors do not avoid the challenges of these new possibilities either; these range from the secure storage of participants’ data and secure data transmission to online communication and the need for extensive testing and debugging of online tools.

Hand in hand with these opportunities comes a change in how we interact with other people in our offline world. The frequent use of technology and the internet shapes our interpersonal communication and interactions, as many researchers in the field of cyberpsychology point out. The massive wealth of data individuals leave on the internet, particularly on social media platforms such as Facebook or Google+, is used to investigate personality factors and their impact on various outcomes. This data enables scientists to investigate all kinds of hypotheses, ranging from how personality affects consumer behavior to how the use of social media is associated with depression and loneliness.

For those interested in more information on the advantages and pitfalls of online data collection, we highly recommend reading Denissen, Neumann, and van Zalk’s (2010) article.

Book recommendation: Longitudinal data analysis using structural equation models

In the wake of our recent posts about longitudinal studies, we’d like to recommend a recently published book by John J. McArdle and John R. Nesselroade.


Longitudinal studies are on the rise, no doubt. Properly conducting longitudinal studies and then analyzing the data can be a complex undertaking. John McArdle and John Nesselroade focus on five basic questions that can be tackled with structural equation models when analyzing longitudinal data:

  • Direct identification of intraindividual changes.
  • Direct identification of interindividual differences in intraindividual changes.
  • Examining interrelationships in intraindividual changes.
  • Analyses of causes (determinants) of intraindividual changes.
  • Analyses of causes (determinants) of interindividual differences in intraindividual changes.

I find it especially noteworthy that the authors put an emphasis on factorial invariance over time and latent change scores. In my view, this makes the book a must-read for anyone who wants to become a longitudinal data wizard.

Need another argument? Afraid of cumbersome mathematical language? Here is what the authors say about it: “We focus on the big picture approach rather than the algebraic details.”


Cause and effect: Optimizing the designs of longitudinal studies

A rising number of longitudinal studies have recently been conducted and published in industrial and organizational psychology. Although this is a pleasing development, it should be noted that most published studies are still cross-sectional in nature and thus far less suited for establishing causal relationships. A longitudinal study can potentially provide insights into the direction of effects and the size of effects over time.

Despite their advantages, designing longitudinal studies requires careful consideration and poses tricky theoretical and methodological questions. As Taris and Kompier put it in their editorial to volume 28 of the journal Work & Stress: “…they are no panacea and could yield disappointing and even misleading findings…”. The authors focus on two crucial challenges in longitudinal designs that strongly affect the chances of detecting the true effects among a set of constructs.

Choosing the right time lags in longitudinal designs

Failing to choose the right time lag between two consecutive study waves leads to biased estimates of effects (see also Cole & Maxwell, 2003). If the study interval is much shorter than the true interval, the cause does not have sufficient time to affect the outcome. Conversely, if the study interval is too long, the true effects may already have vanished. Thus, the estimated size of an effect is strongly linked to the length of the interval between two consecutive measurement waves.


The chosen interval should correspond as closely as possible to the true underlying interval. This requires thorough a priori knowledge or reasoning about the underlying causal mechanism and its time lags before conducting a study. What if deducing or estimating an appropriate time lag is not possible? Taris and Kompier (2014) suggest “that researchers include multiple waves in their design, with relatively short time intervals between these waves. Exactly how short will depend on the nature of the variables under study. This way they would maximize the chances of including the right interval between the study waves”. To improve longitudinal research further, the authors propose that researchers report their reasoning for choosing a particular time lag. This would make temporal considerations explicitly what they are: a central part of the theoretical foundation of a longitudinal study.
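How strongly the estimated effect depends on matching the study interval to the true causal lag can be illustrated with a small toy simulation (our own sketch, not from Taris & Kompier; the effect size and lag structure are assumptions). Here X affects Y with a true lag of one time step, and we estimate the cross-lagged correlation at several study intervals:

```python
import random

random.seed(42)

# Toy model: Y at time t depends only on X at time t-1 (true lag = 1)
n_people, n_steps, true_effect = 500, 12, 0.6
data = []
for _ in range(n_people):
    x = [random.gauss(0, 1) for _ in range(n_steps)]
    y = [0.0] + [true_effect * x[t - 1] + random.gauss(0, 1)
                 for t in range(1, n_steps)]
    data.append((x, y))

def lagged_corr(lag):
    """Correlation between X(t) and Y(t+lag), pooled over people and time."""
    xs, ys = [], []
    for x, y in data:
        for t in range(n_steps - lag):
            xs.append(x[t])
            ys.append(y[t + lag])
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx ** 0.5 * vy ** 0.5)

for lag in range(4):
    print(f"study interval {lag}: estimated effect (corr) = {lagged_corr(lag):.3f}")
```

In this toy world the estimated effect is sizable only at the true lag of 1; at an interval of 0 the cause has had no time to act, and at intervals of 2 or more the effect has already vanished, exactly the two failure modes described above.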

Considering reciprocal effects in longitudinal designs

Building on one of their former articles, Taris and Kompier (2014) opt for full panel designs, meaning that the presumed independent variable as well as the presumed outcome are measured at all waves. Such a design allows testing for reciprocal effects. Failing to consider existing reciprocal effects in longitudinal analyses may again lead to biased estimates of effects.


A helpful checklist for conducting and publishing Longitudinal Research

Longitudinal research has increased greatly over the past 20 years, driven by the development of new theories and methodologies. Nevertheless, studies in the social sciences are still dominated by cross-sectional research designs or deficient longitudinal research, because many researchers lack guidelines for conducting longitudinal research that adequately captures the duration of and change in constructs and variables.

To create a more systematic approach to longitudinal research, Ployhart and Ward (2011) have created a quick-start guide on how to conduct high-quality longitudinal research.

The following information covers three stages: the theoretical development of the study design, the analysis of longitudinal results, and relevant tips for publishing the respective research. The authors’ most relevant advice is summarized below in the form of a checklist that can help you improve your research ideas and design:

Why is longitudinal research important?

It helps to investigate not only the relationship between two variables over time but also to disentangle the direction of effects. It also helps to investigate the change in a variable over time and the duration of this change. For instance, one might investigate how the job satisfaction of new hires changes over time and whether certain features of the job (e.g., feedback from the supervisor) predict the form of change. Such questions can only be analyzed through longitudinal investigation with repeated measurements of the construct. In order to study change, at least three waves of data are necessary for a well-conducted longitudinal study (Ployhart & Vandenberg, 2010).

What sample size is needed to conduct longitudinal research?

Since the estimation of power is a complex issue in longitudinal research, the authors give a rather general answer to this question: “the answer to this is easy—as large as you can get!” They do, however, offer a useful rule of thumb. Statistical power depends, among other things, on the number of subjects and the number of repeated measures: “If one must choose between adding subjects versus measurement occasions, our recommendation is to first identify the minimum number of repeated measurements required to adequately test the hypothesized form of change and then maximize the number of subjects.”
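The trade-off between subjects and measurement occasions can be explored with a small Monte Carlo sketch (our own toy illustration, not Ployhart and Ward’s procedure; the effect size, noise level, and test are assumptions). Each person’s linear slope is estimated by ordinary least squares over the waves, and the mean slope is then tested against zero:

```python
import math
import random
import statistics

random.seed(1)

def simulated_power(n_subjects, n_waves, slope=0.15, noise_sd=1.0, n_sims=300):
    """Monte Carlo power of detecting a mean linear change over time.

    Per simulation: estimate each person's OLS slope across the waves,
    then test the mean slope against zero (normal approximation)."""
    times = list(range(n_waves))
    t_mean = sum(times) / n_waves
    sxx = sum((t - t_mean) ** 2 for t in times)
    crit = 1.96  # approximate two-sided 5% critical value
    hits = 0
    for _ in range(n_sims):
        slopes = []
        for _ in range(n_subjects):
            ys = [slope * t + random.gauss(0, noise_sd) for t in times]
            y_mean = sum(ys) / n_waves
            b = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys)) / sxx
            slopes.append(b)
        m = statistics.mean(slopes)
        se = statistics.stdev(slopes) / math.sqrt(n_subjects)
        if abs(m / se) > crit:
            hits += 1
    return hits / n_sims

print("3 waves, 60 subjects :", simulated_power(60, 3))
print("3 waves, 120 subjects:", simulated_power(120, 3))
print("6 waves, 60 subjects :", simulated_power(60, 6))
```

In this sketch, adding waves beyond the minimum needed for the form of change and adding subjects both raise power, which is why the authors’ advice is to first secure enough waves and then maximize subjects.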

When to administer measures?

When studying change over time, the timing of measurement is crucial (Mitchell & James, 2001). The spacing of measurements should adequately capture the expected form of change; the appropriate spacing for a linear change differs from that for a non-linear (e.g., exponential or logarithmic) change. Such thinking is still contrary to common practice: most study designs use evenly spaced measurement occasions and pay little attention to the type of change under study. However, it is important that measurement waves occur frequently enough and cover the theoretically important temporal parts of the change. This requires careful theoretical reasoning beforehand. Otherwise, the statistical models will over- or underestimate the true nature of the changes under study.
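A hypothetical example of such reasoning (our own illustration; the change curve and the time constant are assumptions): if change follows an exponential approach to an asymptote, evenly spaced waves waste measurements in the flat tail, whereas spacing waves at equal amounts of expected change front-loads them where the curve is steep:

```python
import math

# Assumed change curve: y(t) = 1 - exp(-t / tau), with tau = 4 weeks
tau = 4.0

def time_for_fraction(p):
    """Invert the curve: time at which fraction p of total change has occurred."""
    return -tau * math.log(1 - p)

even_waves = [0, 4, 8, 12, 16]
change_based = [round(time_for_fraction(p), 1) for p in (0.0, 0.25, 0.5, 0.75, 0.95)]

print("evenly spaced waves (weeks):", even_waves)
print("change-based waves (weeks) :", change_based)
```

Under these assumptions, half of the total change is already over before week 3, so a design measuring every 4 weeks would mostly observe the plateau and misrepresent the early, rapid part of the change.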

Be it a longitudinal study or a diary study, the cloud solutions software can handle any type of timing and frequency between measurement occasions. The flexibility of our online solutions stems from an “event flow engine” that is based on neural networks.

What to do about missing data?

The statistical analysis of longitudinal research can become complex. One particular challenge in longitudinal data is the treatment of missing data. Since longitudinal studies often suffer from high dropout rates, missing data are a very common phenomenon. Here are some recommendations for reducing missing data before and during data collection. When conducting surveys in organizations, one way to enhance response rates is to make sure that the company allows its employees to complete the survey during working hours. A specific technique that reduces the burden on individual participants while still measuring frequently over a longer time is planned missingness.

When it comes to handling missing data in statistical analyses, the most important question is whether the data are missing at random or not. If the data are missing at random, there is not much to worry about: full information maximum likelihood estimation will provide unbiased estimates despite the missing data points. If the data are not missing at random, more sophisticated analytical techniques may be required. Ployhart and Ward (2011) recommend Little and Rubin (2002) for further reading on this issue.
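Why this distinction matters can be shown with a toy simulation (our own sketch, not from Ployhart & Ward; the score distribution and dropout rates are assumptions). When data go missing completely at random, a simple estimate like the observed mean stays unbiased; when missingness depends on the value itself, the same estimate becomes biased:

```python
import random
import statistics

random.seed(7)

# Assumed true scores: normally distributed around 50
true_scores = [random.gauss(50, 10) for _ in range(20000)]

# Missing completely at random: every observation has a 40% chance of being lost
mcar = [y for y in true_scores if random.random() > 0.4]

# Not missing at random: low scorers drop out more often
# (high scorers y > 50 always stay; others stay with only 60% probability)
mnar = [y for y in true_scores if random.random() > 0.4 or y > 50]

print("full mean :", round(statistics.mean(true_scores), 1))
print("MCAR mean :", round(statistics.mean(mcar), 1))
print("MNAR mean :", round(statistics.mean(mnar), 1))
```

In this sketch the MCAR mean stays close to the full-sample mean, while the MNAR mean is pushed upward because the low scorers who dropped out are no longer represented.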

Which analytical method to use?

Simply put, there are three statistical frameworks that can be used to model longitudinal data.

  • Repeated measures General Linear Model: Useful when the focus of interest lies on mean changes within persons over time and there is little to no missing data.
  • Random coefficient modeling: Useful when one is interested in between-person differences in change over time. Especially useful when the growth models are simple and the predictors of change are static.
  • Structural equation modeling: Useful when one is interested in between-person differences in change over time. Especially useful with more complex growth models, including time-varying predictors, dynamic relationships, or mediated change.

The following table from Ployhart and Ward (2011) gives a more detailed insight into the application of the three methods:

Use the following method when these conditions are present:

Repeated measures general linear model:
  • Focus on group mean change
  • Identify categorical predictors of change (e.g., training vs. control group)
  • Assumptions about residuals are reasonably met
  • Two waves of repeated data
  • Variables are highly reliable
  • Little to no missing data

Random coefficient modeling:
  • Focus on individual differences in change over time
  • Identify continuous or categorical predictors of change
  • Residuals are correlated, heterogeneous, etc.
  • Three or more waves of data
  • Variables are highly reliable
  • Model simple mediated or dynamic models
  • Missing data are random

Structural equation modeling:
  • Focus on individual differences in change over time
  • Identify continuous or categorical predictors of change
  • Residuals are correlated, heterogeneous, etc.
  • Three or more waves of data
  • Want to remove unreliability
  • Model complex mediated or dynamic models

How to make a relevant theoretical contribution worth publishing?

When publishing longitudinal research, you should always describe why your longitudinal design explains the constructs and their relationships better than an equivalent cross-sectional design would. You should also underline how your study design improves on previous ones. Try to go through the following questions when justifying why your research is worth publishing:

  • Have you developed hypotheses from a cross-sectional or from a longitudinal theory?
  • Have you explained why change occurs in your constructs?
  • Have you described why you measured the variables at various times and how this constitutes a sufficient sampling rate?
  • Have you considered threats to internal validity?
  • Have you explained how you reduced missing data?
  • Have you explained why you chose this analytical method?

cloud solutions wishes you success with your longitudinal research!


How do I achieve a high response rate in my study?

Response rates in questionnaire studies – findings from a meta-analysis of over 2,000 surveys.

Research in organizations relies heavily on questionnaire surveys. This carries the risk that a substantial share of the addressed population does not respond. Low response rates cause problems when generalizing results to the population under study (insufficient external validity). Small samples due to too few respondents additionally increase the risk of low statistical power and limit the kinds of statistical techniques that can be applied. Some researchers assume that the popularity of questionnaire studies in recent years has aggravated these risks.

For an optimal design of studies in organizations, two questions are therefore particularly relevant:

  • Have response rates in questionnaire studies declined in recent years?
  • If so, which techniques for increasing response rates are especially effective today?

A meta-analysis by Frederik Anseel, Filip Lievens, Eveline Schollaert, and Beata Choragwicka, published in the Journal of Business and Psychology, provides answers to these questions. The authors analyzed over 2,000 questionnaire studies published between 1995 and 2008 in scientific journals of industrial and organizational psychology, management, and marketing. It is one of the first studies ever to examine the effect of online questionnaire studies on response rates in organizational settings.

The study shows the following:

  • The average response rate in the analyzed studies is 52%, with a standard deviation of 24%.
  • Response rates declined slightly between 1995 and 2008 (by 0.6% per year). This effect, however, was compensated by the increased use of response-enhancing techniques.
  • Across all groups of respondents, the following techniques are effective at increasing response rates: sending advance information before the study starts, personalization (e.g., addressing participants personally), demonstrating the relevance of the topic (increasing salience), using anonymous identification codes, university or otherwise reputable sponsorship, and distributing questionnaires in person.
  • Online studies are not equally advisable for every population. The study shows that an online survey is an effective means of increasing response rates among non-managers (employees without a leadership function). For other groups (e.g., top management), online administration can even lead to lower response rates than paper-based surveys.
  • Financial incentives are not an effective means of increasing response rates.

In summary, the authors give the following tips:


The online solutions from Bright Answer support the application of the techniques mentioned above. The software offers automated sending of personally addressed advance information, study invitations, and reminders; the use of anonymous access codes is also possible.
The software also offers an additional kind of incentive for study participants. At the end of a study, participants can view automatically generated individual feedback and compare themselves with the average of the other participants or, if available, with other benchmarks. Experience shows that the prospect of individual feedback enables high response rates. Especially in longitudinal studies with many measurement waves (5 or more), this incentive can prevent many dropouts.