References
Abelson, H. (1986). Lecture 1A: Overview and introduction to Lisp
[lecture transcript]. MIT OpenCourseWare 6.001 Structure and
Interpretation of Computer Programs. https://ocw.mit.edu/courses/6-001-structure-and-interpretation-of-computer-programs-spring-2005/resources/1a-overview-and-introduction-to-lisp/
Arshad, A., Ghaleb, T., & Ralph, P. (2021). Towards a more
structured peer review process with empirical standards. Proceedings
of the 25th International Conference on Evaluation and Assessment in
Software Engineering, 353–358. https://doi.org/10.1145/3463274.3463359
Arvanitou, E.-M., Ampatzoglou, A., Chatzigeorgiou, A., & Carver, J.
C. (2021). Software engineering practices for scientific software
development: A systematic mapping study. Journal of Systems and
Software, 172, 110848. https://doi.org/10.1016/j.jss.2020.110848
Barua, A., Thomas, S. W., & Hassan, A. E. (2014). What are
developers talking about? An analysis of topics and trends
in Stack Overflow. Empirical Software
Engineering, 19(3), 619–654. https://doi.org/10.1007/s10664-012-9231-y
Begel, A., & Zimmermann, T. (2014). Analyze this! 145 questions for
data scientists in software engineering. Proceedings of the 36th
International Conference on Software Engineering, 12–23. https://doi.org/10.1145/2568225.2568233
Beller, M., Spruit, N., Spinellis, D., & Zaidman, A. (2018). On the
dichotomy of debugging behavior among programmers. Proceedings of
the 40th International Conference on Software Engineering, 572–583.
https://doi.org/10.1145/3180155.3180175
Blackburn, S. M. et al. (2006). The DaCapo
benchmarks: Java benchmarking development and analysis.
Proceedings of the 21st Annual ACM SIGPLAN Conference on
Object-Oriented Programming Systems, Languages, and Applications,
169–190. https://doi.org/10.1145/1167473.1167488
Boehm, B. W., Elwell, J. F., Pyster, A. B., Stuckle, E. D., &
Williams, R. D. (1982). The TRW software productivity system.
Proceedings of the 6th International Conference on Software
Engineering, 148–156. https://dl.acm.org/doi/10.5555/800254.807757
Booth, W. C., Colomb, G. G., Williams, J. M., Bizup, J., &
FitzGerald, W. T. (2016). The craft of research (4th ed.).
University of Chicago Press.
Burns, R. B. (2000). Introduction to research methods (4th
ed.). SAGE Publications.
Carrera-Rivera, A., Ochoa, W., Larrinaga, F., & Lasa, G. (2022).
How-to conduct a systematic literature review: A quick guide for
computer science research. MethodsX, 9, 101895. https://doi.org/10.1016/j.mex.2022.101895
Carvalho, L., Degiovanni, R., Cordy, M., Aguirre, N., Le Traon, Y.,
& Papadakis, M. (2024). SpecBCFuzz: Fuzzing LTL solvers with
boundary conditions. Proceedings of the IEEE/ACM 46th International
Conference on Software Engineering. https://doi.org/10.1145/3597503.3639087
Choudhuri, R., Liu, D., Steinmacher, I., Gerosa, M., & Sarma, A.
(2024). How far are we? The triumphs and trials of generative AI in
learning software engineering. Proceedings of the IEEE/ACM 46th
International Conference on Software Engineering. https://doi.org/10.1145/3597503.3639201
Claes, M., Mäntylä, M. V., Kuutila, M., & Adams, B. (2018). Do
programmers work at night or during the weekend? Proceedings of the
40th International Conference on Software Engineering, 705–715. https://doi.org/10.1145/3180155.3180193
Creswell, J. W., & Creswell, J. D. (2018). Research design:
Qualitative, quantitative, and mixed methods approaches (5th ed.).
SAGE Publications.
Denning, P. J. (2005). Is computer science science? Communications of the
ACM, 48(4), 27–31. https://doi.org/10.1145/1053291.1053309
Dubey, R. K., Thrash, T., Kapadia, M., Hoelscher, C., & Schinazi, V.
R. (2021). Information theoretic model to simulate agent-signage
interaction for wayfinding. Cognitive Computation,
13(1), 189–206.
Easterbrook, S., Singer, J., Storey, M.-A., & Damian, D. (2008).
Selecting empirical methods for software engineering research. In F.
Shull, J. Singer, & D. I. K. Sjøberg (Eds.), Guide to advanced
empirical software engineering (pp. 285–311). Springer London. https://doi.org/10.1007/978-1-84800-044-5_11
Futatsugi, K., & Okada, K. (1982). A hierarchical structuring method
for functional software systems. Proceedings of the 6th
International Conference on Software Engineering, 393–402. https://dl.acm.org/doi/10.5555/800254.807782
Gray, J. (1992). Benchmark handbook: For database and transaction
processing systems. Morgan Kaufmann Publishers Inc.
Habiba, U., Habib, M. K., Bogner, J., Fritzsch, J., & Wagner, S.
(2024). How do ML practitioners perceive explainability? An interview
study of practices and challenges. Empirical Software Engineering,
30(1). https://doi.org/10.1007/s10664-024-10565-2
Hall, T., Beecham, S., Bowes, D., Gray, D., & Counsell, S. (2012). A
systematic literature review on fault prediction performance in software
engineering. IEEE Transactions on Software Engineering,
38(6), 1276–1304. https://doi.org/10.1109/TSE.2011.103
Hoda, R., Noble, J., & Marshall, S. (2013). Self-organizing roles on
agile software development teams. IEEE Transactions on Software
Engineering, 39(3), 422–444. https://doi.org/10.1109/TSE.2012.30
Huang, Y., Wang, J., Liu, Z., Wang, Y., Wang, S., Chen, C., Hu, Y.,
& Wang, Q. (2024). CrashTranslator: Automatically reproducing mobile
application crashes directly from stack trace. Proceedings of the
IEEE/ACM 46th International Conference on Software Engineering. https://doi.org/10.1145/3597503.3623298
Huijgens, H., Rastogi, A., Mulders, E., Gousios, G., & Deursen, A.
van. (2020). Questions for data scientists in software engineering: A
replication. Proceedings of the 28th ACM Joint Meeting on European
Software Engineering Conference and Symposium on the Foundations of
Software Engineering, 568–579. https://doi.org/10.1145/3368089.3409717
Huppler, K. (2009). The art of building a good benchmark. In R. Nambiar
& M. Poess (Eds.), Performance evaluation and benchmarking
(pp. 18–30). Springer Berlin Heidelberg.
Inal, Y., Clemmensen, T., Rajanen, D., Iivari, N., Rizvanoglu, K., &
Sivaji, A. (2020). Positive developments but challenges still ahead: A
survey study on UX professionals’ work practices. Journal of Usability
Studies, 15(4), 210–246.
Inayat, I., Salim, S. S., Marczak, S., Daneva, M., & Shamshirband,
S. (2015). A systematic literature review on agile requirements
engineering practices and challenges. Computers in Human
Behavior, 51, 915–929. https://doi.org/10.1016/j.chb.2014.10.046
Jedlitschka, A., & Pfahl, D. (2005). Reporting guidelines for
controlled experiments in software engineering. 2005 International
Symposium on Empirical Software Engineering, 1–10. https://doi.org/10.1109/ISESE.2005.1541818
Kabir, S., Udo-Imeh, D. N., Kou, B., & Zhang, T. (2024). Is
Stack Overflow obsolete? An empirical study of
the characteristics of ChatGPT answers to
Stack Overflow questions. Proceedings of
the 2024 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3613904.3642596
Kalliamvakou, E., Gousios, G., Blincoe, K., Singer, L., German, D. M.,
& Damian, D. (2014). The promises and perils of mining
GitHub. Proceedings of the 11th Working
Conference on Mining Software Repositories, 92–101. https://doi.org/10.1145/2597073.2597074
Kampenes, V. B., Dybå, T., Hannay, J. E., & Sjøberg, D. I. K.
(2007). A systematic review of effect size in software engineering
experiments. Information and Software Technology,
49(11), 1073–1086. https://doi.org/10.1016/j.infsof.2007.02.015
Kazemi, M. et al. (2025). BIG-bench extra hard.
Proceedings of the 63rd Annual Meeting of the Association for
Computational Linguistics (Volume 1: Long Papers), 26473–26501. https://doi.org/10.18653/v1/2025.acl-long.1285
Keshav, S. (2007). How to read a paper. ACM SIGCOMM Computer Communication
Review, 37(3), 83–84. https://doi.org/10.1145/1273445.1273458
Kitchenham, B. A., Dybå, T., & Jørgensen, M. (2004). Evidence-based
software engineering. Proceedings of the 26th International
Conference on Software Engineering, 273–281. https://doi.org/10.1109/ICSE.2004.1317449
Kitchenham, B., & Charters, S. (2007). Guidelines for performing
systematic literature reviews in software engineering (Technical
Report EBSE-2007-01). Keele University; Durham University Joint Report.
https://legacyfileshare.elsevier.com/promis_misc/525444systematicreviewsguide.pdf
Kounev, S., Lange, K.-D., & Kistowski, J. von. (2025). Systems
benchmarking: For scientists and engineers (2nd ed.). Springer. https://doi.org/10.1007/978-3-031-85634-1
Krause, A., Kaur, H., Klemmer, J. H., Wiese, O., & Fahl, S. (2025).
“That’s my perspective from 30 years of doing this”: An
interview study on practices, experiences, and challenges of updating
cryptographic code. 34th USENIX Security Symposium, 2907–2926.
Lamport, L. (2012). How to write a 21st century proof. Journal of
Fixed Point Theory and Applications, 11(1), 43–63.
Lawrance, J., Bogart, C., Burnett, M., Bellamy, R., Rector, K., &
Fleming, S. D. (2013). How programmers debug, revisited: An information
foraging theory perspective. IEEE Transactions on Software
Engineering, 39(2), 197–215. https://doi.org/10.1109/TSE.2010.111
Miao, X., Wu, Y., Chen, L., Gao, Y., & Yin, J. (2023). An
experimental survey of missing data imputation algorithms. IEEE
Transactions on Knowledge and Data Engineering, 35(7),
6630–6650. https://doi.org/10.1109/TKDE.2022.3186498
Munaiah, N., Kroh, S., Cabrey, C., & Nagappan, M. (2017). Curating
GitHub for engineered software projects.
Empirical Software Engineering, 22(6), 3219–3253. https://doi.org/10.1007/s10664-017-9512-6
Nakamoto, Y., Iwamoto, T., Hori, M., Hagihara, K., & Tokura, N.
(1982). An editor for documentation in π-system to support software
development and maintenance. Proceedings of the 6th International
Conference on Software Engineering, 330–339. https://dl.acm.org/doi/10.5555/800254.807775
OECD. (2015). Frascati manual 2015: Guidelines for collecting and
reporting data on research and experimental development (p. 398).
OECD Publishing. https://doi.org/10.1787/9789264239012-en
Park, J. S., O’Brien, J., Cai, C. J., Morris, M. R., Liang, P., &
Bernstein, M. S. (2023). Generative agents: Interactive simulacra of
human behavior. Proceedings of the 36th Annual ACM Symposium on User
Interface Software and Technology. https://doi.org/10.1145/3586183.3606763
Röthlisberger, D., Härry, M., Binder, W., Moret, P., Ansaloni, D.,
Villazón, A., & Nierstrasz, O. (2012). Exploiting dynamic
information in IDEs improves speed and correctness of software
maintenance tasks. IEEE Transactions on Software Engineering,
38(3), 579–591. https://doi.org/10.1109/TSE.2011.42
Saunders, B., Sim, J., Kingstone, T., Baker, S., Waterfield, J.,
Bartlam, B., Burroughs, H., & Jinks, C. (2018). Saturation in
qualitative research: Exploring its conceptualization and
operationalization. Quality & Quantity, 52(4),
1893–1907. https://doi.org/10.1007/s11135-017-0574-8
Shahin, M., Liang, P., & Babar, M. A. (2014). A systematic review of
software architecture visualization techniques. Journal of Systems
and Software, 94, 161–185. https://doi.org/10.1016/j.jss.2014.03.071
Shreeve, B., Gralha, C., Rashid, A., Araújo, J., & Goulão, M.
(2023). Making sense of the unknown: How managers make cyber security
decisions. ACM Transactions on Software Engineering and Methodology, 32(4). https://doi.org/10.1145/3548682
Steimann, F. (2018). Fatal abstraction. Proceedings of the 2018 ACM
SIGPLAN International Symposium on New Ideas, New Paradigms, and
Reflections on Programming and Software, 125–130. https://doi.org/10.1145/3276954.3276966
Stol, K.-J., & Fitzgerald, B. (2018). The ABC of software
engineering research. ACM Transactions on Software Engineering and Methodology,
27(3). https://doi.org/10.1145/3241743
Vidoni, M. (2022). A systematic process for mining software
repositories: Results from a systematic literature review. Information and
Software Technology, 144, 106791. https://doi.org/10.1016/j.infsof.2021.106791
Wobbrock, J. O., & Kientz, J. A. (2016). Research contributions in
human-computer interaction. Interactions, 23(3),
38–44. https://doi.org/10.1145/2907069
Wohlin, C., & Aurum, A. (2015). Towards a decision-making structure
for selecting a research design in empirical software engineering.
Empirical Software Engineering, 20(6), 1427–1455. https://doi.org/10.1007/s10664-014-9319-7
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B., &
Wesslén, A. (2024). Systematic literature studies. In
Experimentation in software engineering (2nd ed., pp. 51–63).
Springer. https://doi.org/10.1007/978-3-662-69306-3_4
Yang, D., Martins, P., Saini, V., & Lopes, C. (2017). Stack
Overflow in GitHub: Any snippets
there? 2017 IEEE/ACM 14th International Conference on Mining
Software Repositories (MSR), 280–290. https://doi.org/10.1109/MSR.2017.13
Yang, X., Lo, D., Xia, X., Zhang, Y., & Sun, J. (2015). Deep
learning for just-in-time defect prediction. 2015 IEEE International
Conference on Software Quality, Reliability and Security, 17–26. https://doi.org/10.1109/QRS.2015.14
Zeller, A., & Lütkehaus, D. (1996). DDD—a free graphical front-end
for UNIX debuggers. ACM SIGPLAN Notices, 31(1), 22–27. https://doi.org/10.1145/249094.249108
Zobel, J. (2014). Writing for computer science. Springer. https://doi.org/10.1007/978-1-4471-6639-9