4 Choosing Research Questions and Methods – Research Methods in Computer Science

4.1 Research Questions

A research question (RQ) is a clearly formulated question that we would like to answer in our study using research methods. In general, a good research question should be:

Focused: If we have multiple research questions per paper, they should all relate to the main goal.
Researchable and feasible: We must be able to answer it scientifically and using reasonable resources.
Specific: A research question should not be vague.
Complex enough and relevant: The answer should be neither trivial nor certain to be useless.

For concrete examples, Begel & Zimmermann (2014) and Huijgens et al. (2020) list questions that industrial software engineers find relevant. Similar lists may exist for other subfields.

Sometimes, a research question (or a hypothesis) is only implicit in a paper. This is the case mainly for papers whose main contribution is an artifact – the research question then might be “Is it feasible to design X?” or “How to design an approach X so that it has some property Y?”. However, particularly for empirical studies, we should always state a research question explicitly.

4.2 Hypotheses

A hypothesis is a declarative sentence whose truth is not yet known. It often specifies a relationship between variables, e.g., “A high-contrast mouse cursor improves the clicking precision on a target rectangle on a screen.” A hypothesis must be falsifiable, which means that an empirical research method could, in principle, produce evidence that rejects it.

Any hypothesis can be written as a research question, but the reverse does not always hold. For instance, the question “How do developers unit test database modules in procedural languages?” cannot be written as a hypothesis. If a statement is statistically testable, a hypothesis is the preferred form.
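To illustrate what "statistically testable" can mean, the following minimal sketch applies a one-sided permutation test to the mouse-cursor hypothesis. The function name `permutation_test` and the precision values are hypothetical and serve only as an illustration; they do not come from any real study, and a permutation test is just one of several tests that could be chosen here.

```python
import random
from statistics import fmean


def permutation_test(treatment, control, n_resamples=10_000, seed=0):
    """One-sided permutation test for mean(treatment) > mean(control).

    Returns the observed difference of means and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = fmean(treatment) - fmean(control)
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)

    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis, group labels are exchangeable,
        # so we reassign them at random and recompute the difference.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_treatment]) - fmean(pooled[n_treatment:])
        if diff >= observed:
            at_least_as_extreme += 1

    # The add-one correction keeps the estimate strictly positive.
    p_value = (at_least_as_extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical data: fraction of clicks that hit the target, per participant.
high_contrast = [0.91, 0.88, 0.93, 0.90, 0.87, 0.92]
default_cursor = [0.85, 0.89, 0.84, 0.86, 0.88, 0.83]

difference, p = permutation_test(high_contrast, default_cursor)
print(f"difference of means: {difference:.3f}, estimated p-value: {p:.4f}")
```

A small p-value would count as evidence against the null hypothesis of no improvement. A question such as "How do developers unit test database modules?" offers no comparable quantity to test, which is why it stays a research question.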

A hypothesis is a statement about a specific phenomenon, such as “When debugging, having dynamic information from Senseo (Rothlisberger et al., 2012) available reduces the time for solving maintenance tasks.” In contrast, a theory is a proposition explaining a general phenomenon. It is usually based on the generalization of a large quantity of evidence in the area. For example, the information foraging theory for debugging (Lawrance et al., 2013) explains how developers navigate a program, similarly to a predator following prey, using concepts such as scent, proximal cues, or topology.

4.3 Operationalization

We know that a research question or a hypothesis should be specific and precisely defined. However, even well-stated hypotheses and RQs contain terms with many possible interpretations. Consider the hypothesis:

Using a time-traveling debugger improves developers’ efficiency in fixing bugs.

Multiple time-traveling debuggers exist, each with different features. Many people consider themselves developers: from high-school students coding for fun to senior developers in corporations. What actually constitutes efficiency is a notoriously debated problem.

We thus have to provide operational definitions of important terms. An operational definition expresses how the given term will be interpreted in this specific study, i.e., how it will be measured, observed, or applied.

For example, we will use a specific time-traveling debugger, TimeDebugX, which implements certain features. By “developers”, we will mean professional programmers with at least one year of industrial Java experience. We will measure “efficiency” as the number of successfully solved bug-fixing tasks per hour.
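One way to check that an operational definition is precise enough is to write it down as code. The sketch below encodes the two definitions above. The `Session` record, its field names, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Operational threshold taken from the definition of "developers" above.
MIN_INDUSTRIAL_JAVA_YEARS = 1.0


@dataclass
class Session:
    """One participant's debugging session (hypothetical record layout)."""
    participant_id: str
    industrial_java_years: float
    tasks_solved: int      # successfully solved bug-fixing tasks
    hours_worked: float    # time spent on the tasks, in hours


def is_developer(session: Session) -> bool:
    """'Developer' = at least one year of industrial Java experience."""
    return session.industrial_java_years >= MIN_INDUSTRIAL_JAVA_YEARS


def efficiency(session: Session) -> float:
    """'Efficiency' = successfully solved bug-fixing tasks per hour."""
    if session.hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return session.tasks_solved / session.hours_worked


# Hypothetical sessions, not real data.
sessions = [
    Session("p1", industrial_java_years=2.0, tasks_solved=3, hours_worked=1.5),
    Session("p2", industrial_java_years=0.5, tasks_solved=4, hours_worked=2.0),
]

# Only participants who meet the operational definition enter the analysis.
for s in sessions:
    if is_developer(s):
        print(s.participant_id, efficiency(s))
```

Writing the definitions this way exposes decisions that the prose leaves open, for example how to treat a participant with exactly one year of experience or a session with zero recorded time.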

4.4 Choosing a Research Method

Many beginning researchers make the mistake of selecting a research method without having any specific hypothesis or RQ in mind. For instance, they design a new approach, which they would like to evaluate empirically. They start thinking about a questionnaire survey as a relatively easy way to collect data from multiple respondents. They then add many questions to the survey, ranging from demographic details of every kind to unclear and suggestive questions about how the respondents liked various aspects of the newly designed system.

This is bad practice. When doing research, we should always define RQs/hypotheses first and only then proceed to choosing research methods. Otherwise, we risk that the method will answer only irrelevant questions or will not reliably answer any research question at all.

There are multiple guides and overviews suggesting the most suitable research methods for given types of research questions, e.g., by Wohlin & Aurum (2015) or Easterbrook et al. (2008). In the next chapters, we will describe some of the frequently used research methods, including controlled experiments, surveys, interviews, case studies, and benchmarking studies, among others. For each method, we will mention examples of RQs it is suitable for answering. However, in practice, the methods and their principles are not always clearly separated, so we may use, e.g., the concept of sampling known from surveys in a benchmarking study.

During a research project, it is often beneficial to use multiple methods or data sources in a practice called triangulation. For example, we can obtain requirements for our new approach using a survey, assess its machine-based efficiency by a benchmark, find its advantages and disadvantages in a case study, and finally assess its human-based efficiency using a controlled experiment with humans. This increases the validity of the study and provides a more comprehensive understanding of the subject.

Exercises

Which of the following are good research questions and why?
1. Why do students fail in the operating systems bachelor’s courses?
2. How to determine if a program written in C terminates without running it?
3. What proportion of C++ Gists on GitHub is compilable without downloading third-party libraries?
Which of the following are good hypotheses? Why?
1. Java is a fast programming language.
2. An LL parser is faster than an LR parser for converting JSON into an in-memory tree structure.
Find a paper in your field that explicitly lists operational definitions of terms used in a hypothesis or a research question.