Tutorials Program

Sunday 30th afternoon

T2 Title: The Use of Text Retrieval and Natural Language Processing in Software Engineering

Abstract: During software evolution many related artifacts are created or modified. Some of these are composed of structured data (e.g., analysis data), some contain semi-structured information (e.g., source code), and many include unstructured information (e.g., natural language text). Software artifacts written in natural language (e.g., requirements, bug reports, etc.), together with the comments and identifiers in the source code, encode to a large degree the domain of the software, the developers’ knowledge about the system, design decisions, developer information, etc. Retrieving and analyzing the textual information existing in software are extremely important in supporting a variety of software engineering (SE) tasks. Text retrieval (TR) is a branch of information retrieval (IR) that leverages information stored primarily in the form of text. TR methods have proven to be suitable candidates for the retrieval and analysis of textual data embedded in software or present in other sources. In most SE applications, TR techniques are used in conjunction with natural language processing (NLP) tools. We have surveyed the literature over the past decades and found more than 500 research papers that address more than 20 different SE tasks through TR and NLP of software documents. Examples include traceability link recovery, concept location, change impact analysis, bug triage, and refactoring. The tutorial aims at providing sufficient information to allow SE researchers and practitioners to start using TR and NLP methods in their day-to-day work. The tutorial will first give a background on the main TR and NLP techniques used in SE and then give an overview of the application of these techniques to specific SE tasks. Finally, the tutorial will conclude with a discussion of the challenges, advantages, and limitations of the use of TR and NLP in software engineering and lay out future directions for TR and NLP in SE from the practice and research points of view.


  • Venera Arnaoudova, Washington State University, U.S.A.
  • Short Bio: is Assistant Professor at Washington State University, in Pullman, WA, U.S.A. She received her Ph.D. degree in 2014 from Polytechnique Montréal under the supervision of Dr. Giuliano Antoniol and Dr. Yann-Gaël Guéhéneuc. After her Ph.D. she had the opportunity to work as a postdoctoral research fellow with Dr. Andrian Marcus at the University of Texas at Dallas. Her research interest is in the domain of software evolution and particularly, the analysis of source code lexicon and documentation. Her dissertation focused on the improvement of the code lexicon and its consistency using NLP, fault prediction models, and empirical studies. Arnaoudova has published in several international SE conferences and journals. She served and is serving as program committee member for ICPC 2015; the Early Research Achievements Track at SANER 2015 and ICPC 2013; as external reviewer for ICSE 2014, MSR 2014, CSMR 2013, and others. More information available at: http://www.veneraarnaoudova.com/.

  • Sonia Haiduc, Florida State University, U.S.A.
  • Short Bio: is Assistant Professor at Florida State University, in Tallahassee, FL, U.S.A. Her research interests are in software maintenance, software evolution, and program comprehension. Her recent (2013) Ph.D. dissertation focused on the use of NLP and machine learning techniques to improve applications of text retrieval in software engineering, especially with query reformulations. Her papers have been published in several highly selective software engineering venues. She is one of the organizers of the past two editions of the Workshop on Mining Unstructured Data in Software Engineering (MUD) and is currently a member of the organizing committee for ICSE 2015, ICPC 2015, and SCAM 2015. She has also been involved in the organizing committees of several previous conferences in the field. Haiduc has also served as a program committee member for several conferences, including MSR, ICSME, ICPC, SANER, WCRE, and CSMR. More information is available at: http://www.cs.fsu.edu/~shaiduc/.

  • Andrian Marcus, The University of Texas at Dallas, U.S.A.
  • Short Bio: is Associate Professor at The University of Texas at Dallas, in Richardson, TX, U.S.A. His current research interests are focused on software evolution and program comprehension. He is best known for his more than decade-long work on using information retrieval and text mining techniques for software analysis to support comprehension tasks during software evolution, such as concept location, impact analysis, error prediction, and traceability link recovery. Marcus has received several Best Paper Awards and received the NSF CAREER award in 2009. He has given more than 20 invited seminars or tutorials on the use of text retrieval techniques to support SE tasks at various universities, companies, and summer schools. He was the Chair of the steering committee of ICSME, has served at many conferences as chair and program committee member, and serves on the editorial boards of three SE journals (TSE, EMSE, JSEP). Together with Antoniol, he presented tutorials or technical briefings on related topics at ASE 2010, ASE 2011, ESEC/FSE 2011, and ICSE 2012. More information available at: http://www.utdallas.edu/~amarcus/.

  • Giuliano Antoniol, Ecole Polytechnique de Montréal, Canada
  • Short Bio: is Professor at Polytechnique Montréal, where he works in the areas of software evolution, software traceability, search-based software engineering, and software maintenance. He has worked for several software companies, research institutions, and universities. In 2005 he was awarded the Canada Research Chair Tier I in Software Change and Evolution. He has published more than 190 papers in journals and international conferences, including several works on applying information retrieval approaches to software engineering. Some of his papers received Best Paper Awards. He has served as program chair, industrial chair, tutorial chair, and general chair of many international conferences and workshops, on the editorial boards of five journals, and on the program committees of more than 30 IEEE and ACM conferences and workshops. More information available at: http://www.antoniol.net/.

Monday 31st morning

T3 Title: Software Engineering for Mobile Apps: Research Accomplishments and Future Challenges

Abstract: Over the past few years, we have seen a boom in the popularity of mobile devices and the mobile apps that run on them. These modern apps bring a whole slew of new challenges to software practitioners. Traditional mobile challenges such as limited processing power are no longer as relevant; instead, a whole set of software engineering challenges has emerged due to the highly connected nature of these devices, their unique distribution channels (i.e., app markets like the Apple App store and Android Play market), and novel revenue models (e.g., freemium and subscription apps). Mobile software engineering brings many fundamentally different research challenges and opportunities to software engineering researchers. Recently, researchers have begun to focus on mobile software engineering issues. For example, the 2011 Mining Software Engineering Challenge focused on studying the Android mobile platform. Other work has focused on issues related to code reuse in mobile apps, on mining mobile app data from the app stores, and on the testing and security of mobile apps. This tutorial presents the latest research in Mobile Software Engineering (MSE). First, we present the differences between mobile apps and traditional desktop applications. Then, we discuss the state-of-the-art research on MSE code and user-perceived quality. We also highlight recent findings on the impact of various mobile-specific issues (e.g., monetization, mobility) on traditional software engineering problems. Lastly, the tutorial will present future challenges and opportunities in the area of MSE and provide some resources to further enhance research in this upcoming research area. The tutorial will also highlight data sources such as repositories of open source mobile apps and app store data to enable future research on this topic.


  • Emad Shihab, Concordia University, Canada
  • Short Bio: Emad Shihab is an assistant professor with the Department of Computer Science and Software Engineering at Concordia University. Dr. Shihab held various roles working in the mobile software industry, including working as a software researcher and a software quality assurance associate at Research In Motion in Waterloo, Ontario. Dr. Shihab serves as a co-organizer of the 2013 International Workshop on the Engineering of Mobile-Enabled Systems (MOBS). Dr. Shihab received his B.Eng and MASc. from the University of Victoria and his PhD from Queen’s University. More information can be found at: http://users.encs.concordia.ca/~eshihab

  • Meiyappan Nagappan, Rochester Institute of Technology, USA
  • Short Bio: Meiyappan Nagappan is an assistant professor in the Rochester Institute of Technology’s Department of Software Engineering. He previously was a postdoctoral fellow in the Software Analysis and Intelligence Lab at Queen's University. His research centers on using large-scale software engineering data to address stakeholders’ concerns. Dr. Nagappan received a PhD in computer science from North Carolina State University. He received a best-paper award at the 2012 International Working Conference on Mining Software Repositories. He is the newly appointed Editor of the IEEE Software Blog. Contact him at mei@se.rit.edu; mei-nagappan.com

Monday 31st afternoon

T4 Title: The Art and Science of Analyzing Software Data

Abstract: Using the tools of quantitative data science, software engineers can predict useful information on new projects based on past projects. This tutorial reflects on the state of the art in quantitative and qualitative reasoning in this important field. This tutorial discusses the following: (a) when local data is scarce, we show how to adapt data from other organizations to local problems; (b) when working with data of dubious quality, we show how to prune spurious information; (c) when data or models seem too complex, we show how to simplify data mining results; (d) when the world changes, and old models need to be updated, we show how to handle those updates; (e) when the effect is too complex for one model, we show how to reason over ensembles. Based on feedback from the ICSE’14 tutorial, we will also extend that material as follows. We will add to the 2015 briefing notes on (f) certain landmark results in the history of the field plus (g) more recent landmark and very influential papers such as the association rule methods for guiding programmers to related changes (the methods won a “Most Influential Paper” award at ICSE’14 for Zimmermann et al., “Mining Version Histories to Guide Software Changes”). Target audience: Software practitioners and researchers wanting to understand the state of the art in using data mining for software engineering (SE) data. Pre-requisites: This tutorial makes minimal use of maths or advanced algorithms and would be understandable by developers and technical managers.


  • Leandro L. Minku, University of Birmingham, UK
  • Short Bio: is a Research Fellow at the Centre of Excellence for Research in Computational Intelligence and Applications (CERCIA), School of Computer Science, the University of Birmingham (UK). He received the PhD degree in Computer Science from the University of Birmingham (UK) in 2011, and was an intern at Google Zurich for six months in 2009/2010. He was the recipient of the Overseas Research Students Award (ORSAS) from the British government and several scholarships from the Brazilian Council for Scientific and Technological Development (CNPq). His research focuses on software prediction models, online/incremental machine learning for changing environments, and ensembles of learning machines. He has published in internationally renowned journals such as ACM Transactions on Software Engineering and Methodology, IEEE Transactions on Knowledge and Data Engineering, and Neural Networks.

  • Fayola Peters, Lero, University of Limerick, Ireland
  • Short Bio: is a Postdoctoral Researcher with Lero - The Irish Software Engineering Research Center at the University of Limerick in Ireland. She received the PhD in Computer Science from West Virginia University in 2014. Her research focuses on handling privacy issues related to supporting privacy preserving data sharing for data owners as well as software users. She has published at top software engineering venues like ICSE, IEEE TSE and ESEM. She has also been a curator for the PROMISE repository since 2011.

Tuesday 1st morning

T5 Title: Combining Quantitative and Qualitative Methods in Empirical Software Engineering

Abstract: Software engineering research is becoming quite mature for what concerns the application of quantitative methods for empirical research and for empirical assessment in general. However, in most cases empirical assessment in software engineering is limited to a quantitative one. This in many cases means having a limited view of an observed phenomenon, especially in showing cause-effect-relationships between correlated variables. Using a learn-by-example approach, this tutorial elaborates on how quantitative methods for empirical software engineering can be combined with qualitative ones, and how this complement can contribute to a comprehensive empirical analysis and interpretation in the context of software engineering problems.


  • Massimiliano Di Penta, Università del Sannio, Italy
  • Short Bio: is associate professor at the University of Sannio, Department of Engineering, Italy. His research interests include software maintenance and evolution, software testing, search-based software engineering, and empirical software engineering with a focus on quantitative empirical methods as well as on mining software repositories. He is the author of over 200 papers that have appeared in journals, conferences, and workshops and has served on more than 100 organising and program committees of various software engineering conferences. He is on the editorial boards of the IEEE Transactions on Software Engineering, the Empirical Software Engineering Journal edited by Springer, and the Journal of Software Evolution and Processes edited by Wiley. More info can be found at www.ing.unisannio.it/mdipenta

  • Damian A. Tamburri, Politecnico di Milano, Italy
  • Short Bio: is a post-doctoral fellow at Politecnico di Milano, Italy. His research interests include empirical and social software engineering, with a focus on software development communities and processes for their organisational and social improvement. Damian uses empirical methods, (so far) with a focus on qualitative research. Although in his first year of post-doc, he is the author of over 25 papers and his empirical research has appeared in top journals and venues such as the IEEE Transactions on Software Engineering Journal, the IEEE Software Magazine, the ACM Computing Surveys Journal, and the International Conference on Software Engineering (ICSE) Proceedings. Finally, he was recently instated as Italian Translation Editor and Facebook Ambassador for IEEE Software. More info can be found at www.researchgate.net/profile/Damian_Tamburri

Tuesday 1st afternoon

T6 Title: Challenges of Conducting Software Engineering Experiments: Everything You Always Wanted to Know but Were Afraid to Ask

Abstract: Experimentation is a key issue in science and engineering. But it is one of software engineering’s stumbling blocks. Quite a lot of experiments are run nowadays, but it is a risky business. Software engineering has some special features, leading to some experimentation issues being conceived of differently than in other disciplines. The aim of this tutorial is to help participants to avoid common pitfalls when running software engineering experiments. The tutorial is not intended as an experimental design and analysis course, because there is already plenty of literature on this subject. The tutorial reviews several shortcomings that we have identified in published SE experiments. We go over the stages of an experiment, encouraging discussion about key challenging decisions.


  • Natalia Juristo, Universidad Politécnica de Madrid, Spain
  • Short Bio: received her PhD degree from the Universidad Politécnica de Madrid in 1991. She is currently full professor of software engineering at Universidad Politécnica de Madrid. She was awarded a FiDiPro (Finland Distinguished Professor Program) professorship, starting in January 2013. She was the Director of the UPM MSc in Software Engineering from 1992 to 2002 and coordinator of the Erasmus Mundus European Master on SE (with the participation of the University of Bolzano, the University of Kaiserslautern and the University of Blekinge) from 2006 to 2012. Her main research interests are experimental software engineering, requirements and testing. She co-authored the book Basics of Software Engineering Experimentation (Kluwer, 2001). She is a member of the editorial boards of IEEE Transactions on SE and Empirical SE Journal. She began her career as a developer at the European Space Agency (Rome) and the European Center for Nuclear Research (Geneva). She was a resident affiliate at the CMU Software Engineering Institute in Pittsburgh in 1992.

  • Sira Vegas, University of Oulu, Finland
  • Short Bio: is associate professor of software engineering at Universidad Politécnica de Madrid (UPM), Spain. She was a summer student at the European Centre for Nuclear Research (CERN, Geneva) in 1995. She was a regular visiting scholar of the Experimental Software Engineering Group at the University of Maryland from 1998 to 2000, and visiting scientist at the Fraunhofer Institute for Experimental Software Engineering in Germany in 2002. She has been a member of several program committees, including SEKE, ESEM, CSEET and ICSE-NIER. She is a reviewer of highly ranked journals such as IEEE Transactions on Software Engineering, Empirical Software Engineering Journal, ACM Transactions on Software Engineering and Methodology and Information and Software Technology. Dr. Vegas was program chair for the International Symposium on Empirical Software Engineering and Measurement (ESEM) in 2007.