Employment

  • 2022–Present

    Associate Professor

    Carnegie Mellon University, Pittsburgh, PA USA

  • 2019–2022

    Principal Researcher

    Microsoft Research, Redmond, WA USA

  • 2013–Present

    Affiliate Professor

    University of Washington, Information School, Seattle, WA USA

  • 2014–2019

    Senior Researcher

    Microsoft Research, Redmond, WA USA

  • Winter 2013

    Visiting Lecturer

    University of Washington, Information School

  • 2006–2014

    Researcher

    Microsoft Research, Redmond, WA USA

  • Summer 1997

    Intern

    Xerox PARC Computer Science Laboratory, Palo Alto, CA USA
    Supervisor: Michael Spreitzer

  • 1993–2001

    UROP

    (Undergraduate Research Opportunities Program)

    MIT Media Lab, Cambridge, MA USA
    Supervisor: Mitchel Resnick

  • Summer 1992

    Research Intern

    NYU Medical Center: Institute for Environmental Medicine, Sterling Forest, NY USA
    Supervisor: Toby G. Rossman

  • Summer 1990

    Intern

    NY Medical College: Biochemistry and Molecular Biology, Valhalla, NY USA
    Supervisor: Yuk-Ching Tse-Dinh

  • 1988–1992

    SIGOp: Apple II Special Interest Group

    Mnematics Videotex, Inc., Sparkill, NY USA

Education

  • Ph.D. 2005

    Ph.D. in Computer Science

    University of California at Berkeley

    Advisor: Susan L. Graham

  • M.Eng. 1997

    Master of Engineering in Electrical Engineering and Computer Science

    Massachusetts Institute of Technology

    Advisor: Mitchel Resnick

  • B.S. 1996

    Bachelor of Science in Computer Science and Engineering

    Massachusetts Institute of Technology

    Advisor: Mitchel Resnick

  • High School Regents Diploma 1992

    Rank: 2 out of 443, Average 102.7

    Ramapo Senior High School, Spring Valley, NY

Honors and Awards

  • 2019
    ACM Distinguished Member
  • 2005
    Demitri Angelakos Memorial Award
  • 1992
    National Merit Scholarship
    Salutatorian of High School Class of 1992
    Leo V. Dustman Award in Mathematics
    Mel Sobel Microscopes Scholarship Award
    New York State Science Supervisors Association Award

Research Colleagues

I have worked with many very smart researchers, students, and visitors at Berkeley, Microsoft, and CMU.

You can see a list of my current students on my VariAbility Lab homepage.

Older Research Projects

  • Codebook

    Microsoft Research

    Social Media for Software Engineers

    We use social networking to connect people and artifacts in software development-related repositories.

  • Deep Intellisense

    Microsoft Research

    Dig up the Dirt on your Code

    Deep Intellisense is a Visual Studio 2008 plugin that surfaces information from various silos (source control, bug tracking, mailing lists, etc.) to provide developers with instant context-sensitive feedback on any source code they are reading in the editor. Deep Intellisense works with Visual Studio Team Foundation System projects (such as those hosted on CodePlex), email archives (from Outlook) and SharePoint sites.

  • Agile Methods

    Microsoft Research

    Studies of Agile Development at Microsoft

  • Onboarding New Software Engineers

    Microsoft Research

    Studies of Collocated and Remote Onboarding

  • Software Team Coordination

    Microsoft Research

    Studies of Inter-team Coordination

  • Programming by Voice

    University of California at Berkeley

    SPEED: SPEech EDitor: Code Dictation, Editing by Voice, Commenting by Voice

    Programmers who suffer from repetitive stress injuries find it difficult to program by typing. Speech interfaces can reduce the amount of typing, but existing programming-by-voice techniques make it awkward for programmers to enter and edit program text. We used a human-centric approach to address these problems. We first studied how programmers verbalize code, and found that spoken programs contain lexical, syntactic and semantic ambiguities that do not appear in written programs. Using the results from this study, we designed Spoken Java, a semantically identical variant of Java that is easier to speak. Inspired by a study of how voice recognition users navigate through documents, we developed a novel program navigation technique that can quickly take a software developer to a desired program position.

    Spoken Java is analyzed by extending a conventional Java programming language analysis engine written in our Harmonia program analysis framework. Our new XGLR parsing framework extends GLR parsing to process the input stream ambiguities that arise from spoken programs (and from embedded languages). XGLR parses Spoken Java utterances into their many possible interpretations. To semantically analyze these interpretations and discover which ones are legal, we implemented and extended the Inheritance Graph, a semantic analysis formalism which supports constant-time access to type and use-definition information for all names defined in a program. The legal interpretations are the ones most likely to be correct, and can be presented to the programmer for confirmation.

    We built an Eclipse IDE plugin called SPEED (for SPEech EDitor) to support the combination of Spoken Java, an associated command language, and a structure-based editing model called Shorthand. Our evaluation of this software with expert Java developers showed that most developers had little trouble learning to use the system, but found it slower than typing.

    Although programming-by-voice is still in its infancy, it has already proved to be a viable alternative to typing for those who rely on voice recognition to use a computer. In addition, by providing an alternative means of programming a computer, we can learn more about how programmers communicate about code.
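
The idea described above, expanding a spoken utterance into its possible written interpretations and keeping only the ones that are semantically legal, can be illustrated with a toy Python sketch. This is not XGLR, the Inheritance Graph, or any part of SPEED: the homophone table, the function names, and the one-statement "grammar" are all hypothetical and exist only to show the shape of the technique.

```python
from itertools import product

# Hypothetical homophone table: each spoken word maps to the written
# tokens it might stand for. A real system would derive this from a lexicon.
HOMOPHONES = {
    "for": ["for", "4", "fore"],
    "to": ["to", "2", "too"],
    "eye": ["i", "eye"],
    "equals": ["="],
}


def interpretations(spoken_words):
    """Expand each spoken word into every candidate written token."""
    options = [HOMOPHONES.get(word, [word]) for word in spoken_words]
    return [list(combo) for combo in product(*options)]


def is_legal(tokens, defined_names):
    """Toy stand-in for semantic analysis: accept only `NAME = VALUE`,
    where NAME is already defined and VALUE is an integer literal or a
    defined name."""
    if len(tokens) != 3 or tokens[1] != "=":
        return False
    target, _, value = tokens
    return target in defined_names and (value.isdigit() or value in defined_names)


def legal_interpretations(spoken_words, defined_names):
    """Keep only the interpretations that pass the toy semantic check."""
    return [t for t in interpretations(spoken_words) if is_legal(t, defined_names)]


utterance = ["eye", "equals", "for"]
candidates = interpretations(utterance)               # six candidate token sequences
survivors = legal_interpretations(utterance, {"i"})   # only i = 4 remains
```

With only `i` defined, the six candidates collapse to the single reading `i = 4`, which is the kind of result that could then be shown to the programmer for confirmation.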

  • Harmonia

    University of California at Berkeley

    A Framework for Language-Aware Programming Tools

    Harmonia is an open, extensible framework for constructing interactive, language-aware programming tools. Harmonia is a descendant of our earlier projects, Pan and Ensemble, and uses many analysis technologies developed for those projects. Harmonia includes an incremental GLR parser (which admits a more natural syntax specification than LR), a static semantic analysis engine, and other language-based facilities. Program source code is represented by annotated abstract syntax trees augmented with non-linguistic material such as whitespace and comments. The analysis engine can support any textual language that has formal syntactic and semantic specifications. The incremental nature of the analysis supports a history mechanism that is used both for history-based diagnostic information and for contextual rollback. New languages can be easily added to Harmonia by giving the system a syntactic and semantic description, which is compiled into a dynamically loadable extension for that language. Among the languages for which descriptions exist are Java, Cool (a teaching language), XML, Scheme, Cobol, C, and C++. Other languages are being added to Harmonia as resources permit.

    The language technology implemented in the Harmonia framework is being used in two current research projects: support for high-level interactive transformations and programming by voice. Our research in interactive program transformations focuses on the problem of programmers' expression and interaction with a programming tool. We are combining the results from psychology of programming, user-interface design, software visualization, program analysis, and program transformation to create a novel programming environment that enables the programmer to express source code manipulations in a high-level conceptual manner. Programming by voice research augments traditional text editing by allowing the developer to dictate chunks of program source code as well as verbalize high-level editing operations. This research helps to lower frustrating barriers for software developers who suffer from repetitive strain injuries and other related disabilities that make typing difficult or impossible.

    Harmonia can be used to augment text editors to robustly support the language-aware editing and navigation of documents, including those that are malformed, incomplete, or inconsistent (i.e. the document can remain in that state indefinitely). We have integrated Harmonia into XEmacs by creating a new Emacs "mode" that provides interactive, on-line services to the end user in the program composition, editing and navigation process.

  • StarLogo, StarLogo TNG

    Massachusetts Institute of Technology

    Parallel Programming for Kids!

    StarLogo is a programmable modeling environment for exploring the workings of decentralized systems -- systems that are organized without an organizer, coordinated without a coordinator. With StarLogo, you can model (and gain insights into) many real-life phenomena, such as bird flocks, traffic jams, ant colonies, and market economies.

    In decentralized systems, orderly patterns can arise without centralized control. Increasingly, researchers are choosing decentralized models for the organizations and technologies that they construct in the world, and for the theories that they construct about the world. But many people continue to resist these ideas, assuming centralized control where none exists -- for example, assuming (incorrectly) that bird flocks have leaders. StarLogo is designed to help students (as well as researchers) develop new ways of thinking about and understanding decentralized systems.

    StarLogo is a specialized version of the Logo programming language. With traditional versions of Logo, you can create drawings and animations by giving commands to graphic "turtles" on the computer screen. StarLogo extends this idea by allowing you to control thousands of graphic turtles in parallel. In addition, StarLogo makes the turtles' world computationally active: you can write programs for thousands of "patches" that make up the turtles' environment. Turtles and patches can interact with one another -- for example, you can program the turtles to "sniff" around the world, and change their behaviors based on what they sense in the patches below. StarLogo is particularly well-suited for Artificial Life projects.

    StarLogo TNG is The Next Generation of StarLogo modeling and simulation software. While this version holds true to the premise of StarLogo as a tool to create and understand simulations of complex systems, it also brings with it several advances - 3D graphics and sound, a blocks-based programming interface, and keyboard input - that make it a great tool for programming educational video games.

    Through TNG we hope to:

    1. Lower the barrier to entry for programming with a graphical interface where language elements are represented by colored blocks that fit together like puzzle pieces.
    2. Entice more young people into programming through tools that facilitate making games.
    3. Use 3D graphics to make more compelling and rich games and simulation models.
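
The turtle-and-patch model described in the StarLogo entry above can be sketched in a few lines of Python. This is not StarLogo code and does not use StarLogo's primitives: the `Turtle` class, the grid size, the scent-following rule, and the evaporation rate are all assumptions chosen for illustration, and the turtles are updated one after another rather than truly in parallel.

```python
from dataclasses import dataclass

WIDTH, HEIGHT = 5, 5  # assumed world size, in patches


@dataclass
class Turtle:
    x: int
    y: int


def make_patches():
    """Create a grid of patches, each holding a scent level of 0.0."""
    return [[0.0 for _ in range(WIDTH)] for _ in range(HEIGHT)]


def neighbors(x, y):
    """Coordinates of the four adjacent patches on a wrap-around world."""
    return [((x + 1) % WIDTH, y), ((x - 1) % WIDTH, y),
            (x, (y + 1) % HEIGHT), (x, (y - 1) % HEIGHT)]


def step(turtles, patches, evaporation=0.5):
    """Advance the world by one tick."""
    # Turtle program: "sniff" the adjacent patches, move to the one with
    # the strongest scent, and deposit one unit of scent there.
    for turtle in turtles:
        turtle.x, turtle.y = max(neighbors(turtle.x, turtle.y),
                                 key=lambda p: patches[p[1]][p[0]])
        patches[turtle.y][turtle.x] += 1.0
    # Patch program: every patch lets part of its scent evaporate.
    for row in patches:
        for i in range(len(row)):
            row[i] *= evaporation


patches = make_patches()
patches[2][2] = 10.0                      # seed one patch with scent
turtles = [Turtle(1, 2), Turtle(2, 3)]    # two turtles next to the seeded patch
step(turtles, patches)                    # both turtles move onto patch (2, 2)
```

No turtle is told where the others are; the clustering comes only from each turtle reacting to the patch values around it, which is the decentralized behavior the description emphasizes.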


Neurodiversity and the Accessible University: Exploring Organizational Barriers, Access Labor and Opportunities for Change

Borsotti, Valeria and Begel, Andrew and Bjørn, Pernille
Accessibility Proceedings of the ACM 2024 Conference on Computer Supported Cooperative Work. San Jose, Costa Rica. November 2024.

Abstract

The access needs of neurodivergent individuals in organizational settings are many and varied — and so are their everyday contributions to the creation of collective access. In this study, we contribute to the growing body of CSCW research on accessibility and investigate the invisible access labor of neurodivergent students in three computer science institutions. We use an exploratory, multi-stakeholder approach, combining semi-structured interviews (n=26) and document analysis. We adopted a broad definition of neurodiversity: our study included individuals with autism, dyslexia, ADHD, cyclothymia and individuals with neurological conditions that developed as a result of illness, trauma or injury. Our findings show that neurodivergent students face a number of structural and attitudinal barriers to access in the educational environment and within the disability support system. We identified barriers in three main areas: (i) assistive technology access barriers, (ii) cognitive and physical access barriers, and (iii) social access barriers. We examined how stigma, individualized understandings of disability and intersectional disadvantage shape organizational practices and explored how students are creatively improving collective access through micro-interventions, although these efforts are largely invisible. We then draw on our findings to identify opportunities for change. We propose access grafting as a bottom-up approach to rethinking and reorienting organizational strategies to improve equitable access.

"It's the only thing I can trust": Envisioning Large Language Model Use by Autistic Workers for Communication Assistance

Jang, Jiwoong and Moharana, Sanika and Carrington, Patrick and Begel, Andrew
Accessibility Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. Honolulu, HI, USA. May 2024.

Abstract

Autistic adults often experience stigma and discrimination at work, leading them to seek social communication support from coworkers, friends, and family despite emotional risks. Large language models (LLMs) are increasingly considered an alternative. In this work, we investigate the phenomenon of LLM use by autistic adults at work and explore opportunities and risks of LLMs as a source of social communication advice. We asked 11 autistic participants to present questions about their own workplace-related social difficulties to (1) a GPT-4-based chatbot and (2) a disguised human confederate. Our evaluation shows that participants strongly preferred LLM over confederate interactions. However, a coach specializing in supporting autistic job-seekers raised concerns that the LLM was dispensing questionable advice. We highlight how this divergence in participant and practitioner attitudes reflects existing schisms in HCI on the relative privileging of end-user wants versus normative good and propose design considerations for LLMs to center autistic experiences.

Are Robots Ready to Deliver Autism Inclusion?: A Critical Review

Rizvi, Naba and Wu, William and Bolds, Mya and Mondal, Raunak and Begel, Andrew and Munyaka, Imani N. S.
Accessibility Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. Honolulu, HI, USA. May 2024.

Abstract

The marginalization of autistic people in our society today is multi-faceted as it includes violence that is both physical and ideological in nature. It is rooted in the dehumanization, infantilization, and masculinization of autistic people and is pervasive even in contemporary research studies that continue to echo ableist ideologies from the past. In this work, we identify how HRI research reproduces systemic social inequalities and explain how they align with historical misrepresentations, and other systemic barriers. We analyzed 142 papers focusing on HRI and autism published between 2016 and 2022. We critique these studies through a mixed-methods analysis of their definition of autism, study designs, participant recruitment, and results. Our findings indicate that HRI research stigmatizes autism in three dimensions - 1) the pathologization of autism, 2) gender and age-based essentialism, and 3) power imbalances. Our work uncovered that about 90% of HRI research during the timeline explored excluded the perspectives of autistic people, particularly those from understudied groups. We recommend broadening the inclusion of autistic people, considering research objectives beyond clinical use, and diversifying collaborations, foundational works considered, & participant demographics for more inclusive future work.

Towards Inclusive Source Code Readability Based on the Preferences of Programmers with Visual Impairments

Pandey, Maulishree and Oney, Steve and Begel, Andrew
Accessibility Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. Honolulu, HI, USA. May 2024.

Abstract

Code readability is crucial for program comprehension, maintenance, and collaboration. However, many of the standards for writing readable code are derived from sighted developers' readability needs. We conducted a qualitative study with 16 blind and visually impaired (BVI) developers to better understand their readability preferences for common code formatting rules such as identifier naming conventions, line length, and use of indentation. Our findings revealed how BVI developers' preferences contrast with those of sighted developers and how we can expand the existing rules to improve code readability on screen readers. Based on the findings, we contribute an inclusive understanding of code readability and derive implications for programming languages, development environments, and style guides. Our work helps broaden the meaning of readable code in software engineering and accessibility research.

Mixed Abilities and Varied Experiences: A Group Autoethnography of a Virtual Summer Internship

Mack, Kelly and Das, Maitraye and Jain, Dhruv and Bragg, Danielle and Tang, John and Begel, Andrew and Beneteau, Erin and Davis, Josh Urban and Glasser, Abraham and Park, Joon Sung and Potluri, Venkatesh
Accessibility Communications of the ACM, 66 (8), 105-113. July 2023.

Abstract

The COVID-19 pandemic forced many people to convert their daily work lives to a "virtual" format, in which they connected remotely from home. In this new, virtual environment, accessibility barriers changed, in some respects for the better (e.g., more flexibility) and in other aspects, for the worse (e.g., problems including American Sign Language interpreters over video calls). Microsoft Research held its first cohort of all virtual interns in 2020. We, the interns, full-time members, and affiliates of the Ability Team, a Microsoft research team focused on accessibility, report on our experiences navigating the challenges of working remotely. We constituted a variety of abilities, positions, and levels of seniority. Using an autoethnographic method, we provide a nuanced view of how the virtual setting affected the experiences of our mixed-ability team, the strategies we used to improve access, and areas for further improvement. We close by presenting guidelines for future virtual mixed-ability teams to improve workplace accessibility.

2023 5th Research Workshop on Neurodiversity at Work

Begel, Andrew and Annabi, Hala and Dow-Burger, Kathryn
Accessibility, Workshop 5th Annual Neurodiversity at Work Research Workshop. June 2023.

Abstract

The Neurodiversity at Work Research Workshop brings together leading scholars, neurodivergent leaders, neurodiversity practitioners, and leading neurodiversity employers concerned with advancing neurodiversity employment research. Their work may relate to the preparation, recruitment, persistence, and advancement of neurodivergent individuals in the workplace. Any individual interested in learning about neurodiversity employment research is welcome to attend! This year, the workshop will focus on questions surrounding sustaining efforts that support effective and comprehensive approaches to education and employment that improve employment outcomes and the well-being of the neurodivergent community. Our goals are to (1) build a community of people concerned with research related to the preparation and employment of neurodistinct individuals and convey our discoveries to others in the community; (2) provide a collaborative space for scholars to share their work and receive constructive feedback in order to advance neurodiversity employment research; and (3) further develop a research agenda to advance evidence-based practices to equitably include neurodivergent people in the workplace.

2024 6th Research Conference on Neurodiversity at Work

Begel, Andrew and Annabi, Hala and Dow-Burger, Kathryn
Accessibility, Workshop 6th Annual Neurodiversity at Work Research Workshop. May 2023.

Abstract

The Neurodiversity at Work Research Conference brings together leading scholars, neurodivergent leaders, neurodiversity practitioners, and leading neurodiversity employers concerned with advancing neurodiversity employment research. Their work may relate to the preparation, recruitment, persistence, and advancement of neurodivergent individuals in the workplace. Our goal remains to build a community of people concerned with advancing research related to the preparation and employment of autistic individuals and convey these concerns to others in the community. The conference aims to (1) provide a collaborative space for scholars to share their work and receive constructive feedback in order to advance neurodiversity employment research and (2) further develop a research agenda to advance evidence-based practices to equitably include neurodivergent people in the workplace.

CodeWalk: Facilitating Shared Awareness in Mixed-Ability Collaborative Software Development

Potluri, Venkatesh and Pandey, Maulishree and Begel, Andrew and Barnett, Michael and Reitherman, Scott
Accessibility, Human Aspects Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility. Athens, Greece. 2022.

Abstract

COVID-19 accelerated the trend toward remote software development, increasing the need for tightly-coupled synchronous collaboration. Existing tools and practices impose high coordination overhead on blind or visually impaired (BVI) developers, impeding their abilities to collaborate effectively, compromising their agency, and limiting their contribution. To make remote collaboration more accessible, we created CodeWalk, a set of features added to Microsoft's Live Share VS Code extension, for synchronous code review and refactoring. We chose design criteria to ease the coordination burden felt by BVI developers by conveying sighted colleagues' navigation and edit actions via sound effects and speech. We evaluated our design in a within-subjects experiment with 10 BVI developers. Our results show that CodeWalk streamlines the dialogue required to refer to shared workspace locations, enabling participants to spend more time contributing to coding tasks. This design offers a path towards enabling BVI and sighted developers to collaborate on more equal terms.

Program-L: Online Help Seeking Behaviors by Blind and Low Vision Programmers

Johnson, Jazette and Begel, Andrew and Ladner, Richard and Ford, Denae
Accessibility, Human Aspects 2022 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). Rome, Italy. 2022.

Abstract

Although blind or low vision (BLV) software developers are the largest minority population of developers with physical disabilities, they are often marginalized in mainstream online programming communities. We studied how BLV developers engage with a BLV-specific programming community called Program-L, by exploring the help-seeking behaviors of novices. We analyzed 173 messages written by 20 novices over a 4-year period and identified the kinds of help they asked for and their justifications for requesting that help. We learned that self-disclosure, practical assistance, and community dynamics were all critical activities to support four types of novices: community, domain, programming, and accessibility. The findings of our work give insight into what support can look like for online communities for marginalized software developers.

2022 4th Research Workshop on Autism at Work

Begel, Andrew and Annabi, Hala and Dow-Burger, Kathryn
Accessibility, Workshop 4th Annual Autism at Work Research Workshop. May 2022.

Abstract

The Autism at Work Research Workshop brings together leading scholars, employers, clinicians, service providers, entrepreneurs, caregivers, and autism advocates concerned with autism employment. Their work may relate to the preparation, recruitment, persistence, advancement, and management of autistic individuals in the workplace. Our objectives are to (1) build a community of people concerned with issues related to the preparation and employment of autistic individuals and convey these concerns to others in the community; (2) offer opportunities to connect practitioners with researchers to develop or evaluate supports for the employment of autistic individuals; (3) provide a collaborative space for scholars to share their work and receive constructive feedback in order to advance autism employment research; and (4) further develop a research agenda to advance evidence-based practices to equitably include individuals with autism in the workplace.

"Can You Help Me?" An Experience Report of Teamwork in a Game Coding Camp for Autistic High School Students

Moster, Makayla and Kokinda, Ella and Re, Matthew and Dominic, James and Lehmann, Jason and Begel, Andrew and Rodeghero, Paige
Accessibility Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice. Pittsburgh, PA. 2022.

Abstract

Teamwork skills are increasingly important for students to have as they enter the workforce, especially in software development positions. However, autistic students do not get to practice teamwork since much of their education is focused on learning social skills. The hybrid mode of education comes with challenges, including communication and collaboration issues and teaming difficulties, and this method of teaching and learning can be difficult for students with autism. In this experience report paper, we discuss our experience planning and running a hybrid camp to teach teamwork and programming to 14 autistic high school students. Overall, our camp was successful in teaching students software development skills with open source software, and, from our experience, we detail our lessons learned and provide recommendations for educators and researchers working with autistic students in a hybrid setting.

Mixed Abilities and Varied Experiences: a group autoethnography of a virtual summer internship

Mack, Kelly and Das, Maitraye and Jain, Dhruv and Bragg, Danielle and Tang, John and Begel, Andrew and Beneteau, Erin and Davis, Josh Urban and Glasser, Abraham and Park, Joon Sung and Potluri, Venkatesh
Accessibility Proceedings of ASSETS ’21: SIGACCESS Conference on Computers and Accessibility. October 2021.

Abstract

The COVID-19 pandemic forced many people to convert their daily work lives to a "virtual" format where everyone connected remotely from their home. In this new, virtual environment, accessibility barriers changed, in some respects for the better (e.g., more flexibility) and in other aspects, for the worse (e.g., problems including American Sign Language interpreters over video calls). Microsoft Research held its first cohort of all virtual interns in 2020. We, the authors, full-time and intern members and affiliates of the Ability Team, a research team focused on accessibility, reflect on our virtual work experiences as a team consisting of members with a variety of abilities, positions, and seniority during the summer intern season. Through our autoethnographic method, we provide a nuanced view into the experiences of a mixed-ability, virtual team, and how the virtual setting affected the team’s accessibility. We then reflect on these experiences, noting the successful strategies we used to promote access and the areas in which we could have further improved access. Finally, we present guidelines for future virtual mixed-ability teams looking to improve access.

Inclusive Interpersonal Communication Education for Technology Professionals

Rizvi, Naba and Begel, Andrew and Annabi, Hala
Accessibility Proceedings of the 27th Americas Conference on Information Systems. August 2021.

Abstract

More technology organizations have turned to autistic people to meet their talent needs through the creation of autism-specific hiring programs. Despite the potential of such programs, early research indicates that autistic employees and their neurotypical coworkers face communication challenges due to pronounced differences in styles and preferences. These difficulties lead to breakdowns in communication, collaboration, and coordination which can leave autistic employees feeling isolated and stigmatized. To increase the knowledge and improve attitudes of neurotypical employees about autism, we created and evaluated a training module for neurotypical employees about effectively communicating with their autistic colleagues, thus flipping the traditional burden to adapt from autistic workers to their non-autistic colleagues. Our formative results show that people who took the course increased their knowledge about their own communication styles and preferences and those of autistic people. They acquired skills on how to negotiate these style differences before engaging in conversations with their colleagues.

2021 3rd Research Workshop on Autism at Work

Begel, Andrew and Annabi, Hala and Dow-Burger, Kathryn
Accessibility, Workshop 3rd Annual Autism at Work Research Workshop. April 2021.

Abstract

The Autism at Work Research Workshop brings together leading scholars, employers, clinicians, service providers, entrepreneurs, caregivers, and autism advocates concerned with autism employment. Their work may relate to the preparation, recruitment, persistence, advancement, and management of autistic individuals in the workplace. Our objectives are to build a community of people concerned with issues related to the preparation and employment of autistic individuals and convey these concerns to others in the community, offer opportunities to connect practitioners with researchers to develop or evaluate supports for the employment of autistic individuals, provide a collaborative space for scholars to share their work and receive constructive feedback in order to advance autism employment research, and further develop a research agenda to advance evidence-based practices to equitably include individuals with autism in the workplace.

How a Remote Video Game Coding Camp Improved Autistic College Students’ Self-Efficacy in Communication

Begel, Andrew and Dominic, James and Phillis, Conner and Beeson, Thomas and Rodeghero, Paige
CS Education, Accessibility Proceedings of the 52nd SIGCSE Technical Symposium on Computer Science Education. Canada (Online). March 2021.

Abstract

Communication and teamwork are essential skills for software developers. However, these skills are often difficult to learn for students with autism spectrum disorder (ASD). We designed, developed, and ran a 13-day, remote video game coding camp for incoming college first-year students with ASD. We developed instructional materials to teach computer programming, video game design, and communication and teaming skills. Students used the MakeCode Arcade development environment to build their games and Zoom to remotely collaborate with their teammates. In summative interviews, students reported improved programming skills, increased confidence in communication, and better experiences working with others. We also found that students valued the opportunity to practice teaming, such as being more vocal in expressing ideas to their peers and working out differences of opinion with their teammates. Two students reported the remote learning environment decreased their anxiety and stress, both of which are frequent challenges for autistic people. We plan to rerun the camp next year with materials that we have made available online.

Accessible Computing Education in Colleges and Universities

Baker, Catie and Begel, Andrew and Butler, Matthew and Caspi, Anat and Ghazal, Ramy and Kingston, Neal and Lewis, Clayton and Lewis, Colleen and Mack, Kelly and Mbari-Kirika, Irene and Ohshiro, Keita and Rodeghero, Paige and Shinohara, Kristen and Smith, Julie and Srivastava, Namrata and Steele, Kat and Tamjeed, Murtaza and Tang, John and Tesfay, Adiam and Yamagami, Momona
Accessibility Accessible Computer Science Education Fall Workshop. November 2020.

Abstract

Students with disabilities should be supported in all aspects of higher education, not only in their classroom work, but also their research, their participation in extracurricular activities, and other aspects of student life. Further, the process of obtaining access and accommodations should not add to the challenges that students with disabilities face on campus. Realizing these goals requires addressing a wide range of problems and opportunities, as we outline in this white paper. In this document, we discuss how the culture of computing as a discipline can and should change to better realize the vision of full participation by students with disabilities, as well as to promote greater contributions by computer scientists to the technology of accessibility. Our discussion is framed around this central idea of developing accessibility as a cultural competency in computing, rather than smaller, individual efforts to tackle inclusion within the discipline. We then frame these challenges and opportunities from the perspective of students and faculty.

Accessible Computer Science Education Fall Workshop

Begel, Andrew and Caspi, Anat and Dowdy, Heather and Ladner, Richard and Lewis, Clayton and Morrison, Cecily and Seyed, Teddy and Zimmermann, Roy
Accessibility, Workshop Accessible Computer Science Education Fall Workshop. November 2020.

Abstract

The Accessible Computer Science Education Fall Workshop will be three half-days of talks, discussions and planning for new research dedicated to making Computer Science education learning experiences more accessible for people with disabilities. At this event, we will establish a research driven coalition of civic and academic technologists and practitioners to envision what research is needed to develop tools, services, and ecosystems to make Computer Science education more accessible. Together we will develop high impact research and action plans driving toward deployments.

Lessons Learned in Designing AI for Autistic Adults

Begel, Andrew and Tang, John and Andrist, Sean and Barnett, Mike and Carbary, Tony and Choudhury, Piali and Cutrell, Ed and Fung, Alberto and Junuzovic, Sasa and McDuff, Daniel and Rowan, Kael and Sahoo, Shibashankar and Waldern, Jennifer Frances and Wolk, Jessica and Zheng, Hui and Zolyomi, Annuska
Accessibility Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility. Athens, Greece (Online). October 2020.

Abstract

Through an iterative design process using Wizard of Oz (WOz) prototypes, we designed a video calling application for people with Autism Spectrum Disorder. Our Video Calling for Autism prototype provided an Expressiveness Mirror that gave feedback to autistic people on how their facial expressions might be interpreted by their neurotypical conversation partners. This feedback was in the form of emojis representing six emotions and a bar indicating the amount of overall expressiveness demonstrated by the user. However, when we built a working prototype and conducted a user study with autistic participants, their negative feedback caused us to reconsider how our design process led to a prototype that they did not find useful. We reflect on the design challenges around developing AI technology for an autistic user population, how Wizard of Oz prototypes can be overly optimistic in representing AI-driven prototypes, how autistic research participants can respond differently to user experience prototypes of varying fidelity, and how designing for people with diverse abilities needs to include that population in the development process.

A practical guide on conducting eye tracking studies in software engineering

Sharafi, Zohreh and Sharif, Bonita and Guéhéneuc, Yann-Gaël and Begel, Andrew and Bednarik, Roman and Crosby, Martha
Biometrics, Human Aspects Empirical Software Engineering, 25 (5), 3128–3174. June 2020.

Abstract

For several years, the software engineering research community used eye trackers to study program comprehension, bug localization, pair programming, and other software engineering tasks. Eye trackers provide researchers with insights on software engineers’ cognitive processes, data that can augment those acquired through other means, such as on-line surveys and questionnaires. While there are many ways to take advantage of eye trackers, advancing their use requires defining standards for experimental design, execution, and reporting. We begin by presenting the foundations of eye tracking to provide context and perspective. Based on previous surveys of eye tracking for programming and software engineering tasks and our collective, extensive experience with eye trackers, we discuss when and why researchers should use eye trackers as well as how they should use them. We compile a list of typical use cases—real and anticipated—of eye trackers, as well as metrics, visualizations, and statistical analyses to analyze and report eye-tracking data. We also discuss the pragmatics of eye tracking studies. Finally, we offer lessons learned about using eye trackers to study software engineering tasks. This paper is intended to be a one-stop resource for researchers interested in designing, executing, and reporting eye tracking studies of software engineering tasks.

Affect Recognition in Code Review: An In-situ Biometric Study of Reviewer's Affect

Vrzakova, Hana and Begel, Andrew and Mehtätalo, Lauri and Bednarik, Roman
Biometrics, Human Aspects Journal of Systems and Software, 159 (1), 110434. 2020.

Abstract

Code reviews are an important practice in software development that increases team productivity and improves product quality. They are also examples of remote, computer-mediated asynchronous communications which are prone to the loss of affective information. Prior research has focused on sentiment analysis in source codes, as positive affect has been linked to developer productivity. Although methods of sentiment analysis have advanced, challenges remain due to numerous domain-specific expressions, subtle nuance, and indications of sentiment. In this paper, we uncover the potential for 1) nonverbal behavioral signals such as conventional typing, and 2) indirect physiological measures (eye gaze, GSR, touch pressure) to reveal genuine affective states in in situ code review in a large software company. Nonverbal behavioral signals of 33 professional software developers were recorded unobtrusively while they worked on their daily code reviews. After analyzing these signals using Linear Mixed Effect Models, we observe that affect presented in the written comments is associated with prolonged typing duration. Using physiological features, a trained Random Forest classifier can predict post-task valence with 90.0% accuracy (F1-score = 0.937) and arousal with 83.9% accuracy (F1-score = 0.856). The results show promise for the creation of intelligent affect-aware interfaces for code review.

What Distinguishes Great Software Engineers?

Li, Paul Luo and Ko, Amy J. and Begel, Andrew
Human Aspects Empirical Software Engineering, 25 (1), 322–352. December 2019.

Abstract

Great software engineers are essential to the creation of great software. However, today, we lack an understanding of what distinguishes great engineers from ordinary ones. We address this knowledge gap by conducting one of the largest mixed-method studies of experienced engineers to date. We surveyed 1,926 expert engineers, including senior engineers, architects, and technical fellows, asking them to judge the importance of a comprehensive set of 54 attributes of great engineers. We then conducted 77 email interviews to interpret our findings and to understand the influence of contextual factors on the ratings. After synthesizing the findings, we believe that the top five distinguishing characteristics of great engineers are writing good code, adjusting behaviors to account for future value and costs, practicing informed decision-making, avoiding making others’ jobs harder, and learning continuously. We relate the findings to prior work, and discuss implications for researchers, practitioners, and educators.

Managing Stress: The Needs of Autistic Adults in Video Conferencing

Zolyomi, Annuska and Begel, Andrew and Waldern, Jennifer Frances and Tang, John and Barnett, Mike and Cutrell, Edward and McDuff, Daniel and Andrist, Sean and Morris, Meredith Ringel
Accessibility Proceedings of the ACM 2019 Conference on Computer Supported Cooperative Work. Austin, Texas, USA. November 2019.

Abstract

Video calling (VC) aims to create multi-modal, collaborative environments that are “just like being there.” However, we found that autistic individuals, who exhibit atypical social and cognitive processing, may not share this goal. We interviewed autistic adults about their perceptions of VC compared to other computer-mediated communications (CMC) and face-to-face interactions. We developed a neurodiversity-sensitive model of CMC that describes how stressors such as sensory sensitivities, cognitive load, and anxiety, contribute to their preferences for CMC channels. We learned that they apply significant effort to construct coping strategies to support their sensory, cognitive, and social needs. These strategies include moderating their sensory inputs, creating mental models of conversation partners, and attempting to mask their autism by adopting neurotypical behaviors. Without effective strategies, interviewees experience more stress, have less capacity to interpret verbal and non-verbal cues, and feel less empowered to participate. Our findings reveal critical needs for autistic users. We suggest design opportunities to support their ability to comfortably use VC, and in doing so, point the way towards making VC more comfortable for all.

Summary of the Sixth Edition of the International Workshop on Eye Movements in Programming

Siegmund, Janet and Begel, Andrew and Peitek, Norman
Biometrics, Human Aspects SIGSOFT Software Engineering Notes, 44 (3), 54–55. November 2019.

Abstract

The study of eye gaze data has great potential for research in computer programming, computing education, and software engineering practice. To highlight its role for the software engineering community, the Sixth Edition of the International Workshop on Eye Movements in Programming (EMIP 2019) was co-located with the 41st International Conference on Software Engineering. The goal of the workshop was to advance the methodology of using eye tracking for programming, both theoretically and in applications.

Introduction to the Special Issue on Affect Awareness in Software Engineering

Novielli, Nicole and Begel, Andrew and Maalej, Walid
Human Aspects, Biometrics Journal of Systems and Software, 148 (2), 180-182. 2019.

Best Practices for Engineering AI-infused Applications: Lessons Learned from Microsoft Teams

Begel, Andrew
Human Aspects, Invited Talk Proceedings of the Joint 7th International Workshop on Conducting Empirical Studies in Industry and 6th International Workshop on Software Engineering Research and Industrial Practice. Montreal, Quebec, Canada. 2019.

Abstract

Artificial intelligence and machine learning (AI/ML) are some of the newest trends to hit the software industry, compelling organizations to evolve their development processes to deliver novel products to their customers. In this talk, I describe a study in which we learned how Microsoft software teams develop AI/ML-based applications using a nine-stage AI workflow process informed by prior experiences developing early AI applications (e.g. search and NLP) and data science tools (e.g. application telemetry and bug reporting). Adapting this workflow into their pre-existing, well-evolved, Agile-like software engineering processes and job roles has resulted in a number of engineering challenges unique to the AI/ML domain, some universal to all teams, but others related to the amount of prior AI/ML experience and education the teams have. I tell you about some challenges and the solutions that teams have come up with. The lessons that Microsoft has learned can help other organizations embarking on their own path towards AI and ML.

2019 2nd Research Workshop on Autism at Work

Annabi, Hala and Begel, Andrew and Fung, Lawrence
Accessibility 2nd Annual Autism at Work Research Workshop. May 2019.

Abstract

The Autism at Work Research Workshop brings together a small select group of leading scholars concerned with autism employment. Their work may relate to the preparation, recruitment, persistence, and advancement of individuals with autism in the workplace. Our objectives are to 1, build a community of scholars and practitioners concerned with issues related to the preparation and employment of individuals with autism; 2, provide a collaborative space for scholars to share their work and receive constructive feedback in order to advance autism employment research; and 3, further develop a research agenda to advance evidence-based practices to equitably include individuals with autism in the workplace.

6th International Workshop on Eye Movements in Programming (EMIP 2019)

Begel, Andrew and Siegmund, Janet
Biometrics, Workshop 2019 ACM 6th International Workshop on Eye Movements in Programming (EMIP). May 2019.

Abstract

Welcome to the 6th International Workshop on Eye Movements in Programming (EMIP), co-located with the 41st International Conference on Software Engineering (ICSE 2019) in Montreal, Canada. The study of eye gaze data has great potential for research in computer programming, computing education, and software engineering practice. The Sixth International Workshop on Eye Movements in Programming (EMIP 2019) will focus on advancing the methodological, theoretical, and applied aspects of eye movements in programming. The goal of the workshop is to advance the methodology of using eye gaze tracking for programming, both theoretically and in applications. What can gaze behavior tell us about cognitive processes during programming? How can eye tracking help us to understand the role of human factors in software engineering? The workshop will host a keynote by Dror Feitelson from the Hebrew University of Jerusalem, Israel. In his talk, Prof. Feitelson will survey achievements and suggest future directions about how eye tracking, especially regarding where programmers look, for how long, and how much mental effort they exert, provides crucial data on how reading and comprehending code differs from reading and comprehending regular text. Furthermore, this year's EMIP will have 7 presentations of accepted papers, mixed with interactive poster and hands-on demos. Thus, the entire workshop will have a focus on discussion and community building.

Software Engineering for Machine Learning: A Case Study

Amershi, Saleema and Begel, Andrew and Bird, Christian and DeLine, Robert and Gall, Harald and Kamar, Ece and Nagappan, Nachiappan and Nushi, Besmira and Zimmermann, Thomas
Human Aspects Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice. Montreal, Quebec, Canada. 2019.

Abstract

Recent advances in machine learning have stimulated widespread interest within the Information Technology sector on integrating AI capabilities into software and services. This goal has forced organizations to evolve their development processes. We report on a study that we conducted on observing software teams at Microsoft as they develop AI-based applications. We consider a nine-stage workflow process informed by prior experiences developing AI applications (e.g., search and NLP) and data science tools (e.g. application diagnostics and bug reporting). We found that various Microsoft teams have united this workflow into preexisting, well-evolved, Agile-like software engineering processes, providing insights about several essential engineering challenges that organizations may face in creating large-scale AI solutions for the marketplace. We collected some best practices from Microsoft teams to address these challenges. In addition, we have identified three aspects of the AI domain that make it fundamentally different from prior software application domains: 1) discovering, managing, and versioning the data needed for machine learning applications is much more complex and difficult than other types of software engineering, 2) model customization and model reuse require very different skills than are typically found in software teams, and 3) AI components are more difficult to handle as distinct modules than traditional software components -- models may be "entangled" in complex ways and experience non-monotonic error behavior. We believe that the lessons learned by Microsoft teams will be valuable to other organizations.

Cambridge Handbook of Computing Education Research

Begel, Andrew and Ko, Amy
CS Education February 2019.

Abstract

The history of computing education research is replete with studies about learning in formal contexts, i.e. students learning from teachers in school classrooms. In this chapter, we explore other contexts in which learning about computing occurs, for example, through reading books, working through online tutorials, competing in hackathons, or asking and answering computing questions on a Q&A website. These activities are all examples of informal learning—learning that is opportunistic, rather than planned; unstructured, rather than pedagogically created; self-directed, rather than teacher-centric; and integrated authentically into life activities (Marsick & Watkins, 2001), rather than taking place in a classroom environment. We collect and synthesize research about informal learning of computing and discuss open questions around where and how it occurs, and how to best support it.

What Makes a Great Manager of Software Engineers?

Kalliamvakou, Eirini and Bird, Christian and Zimmermann, Thomas and Begel, Andrew and DeLine, Robert and German, Daniel M.
Human Aspects IEEE Transactions on Software Engineering, 45 (1), 87-106. January 2019.

Abstract

Having great managers is as critical to success as having a good team or organization. In general, a great manager is seen as fuelling the team they manage, enabling it to use its full potential. Though software engineering research studies factors that may affect the performance and productivity of software engineers and teams (like tools and skill), it has overlooked the software engineering manager. The software industry's growth and change in the last decades is creating a need for a domain-specific view of management. On the one hand, experts are questioning how the abundant work in management applies to software engineering. On the other hand, practitioners are looking to researchers for evidence-based guidance on how to manage software teams. We conducted a mixed methods empirical study of software engineering management at Microsoft to investigate what manager attributes developers and engineering managers perceive important and why. We present a conceptual framework of manager attributes, and find that technical skills are not the sign of greatness for an engineering manager. Through statistical analysis we identify how engineers and managers relate in their views, and how software engineering differs from other knowledge work groups in its perceptions about what makes great managers. We present strategies for putting the attributes to use, discuss implications for research and practice, and offer avenues for further work.

Data Scientists in Software Teams: State of the Art and Challenges

Kim, Miryung and Zimmermann, Thomas and DeLine, Robert and Begel, Andrew
Human Aspects IEEE Transactions on Software Engineering, 44 (11), 1024-1038. November 2018.

Abstract

The demand for analyzing large-scale telemetry, machine, and quality data is rapidly increasing in the software industry. Data scientists are becoming popular within software teams, e.g., Facebook, LinkedIn and Microsoft are creating a new career path for data scientists. In this paper, we present a large-scale survey with 793 professional data scientists at Microsoft to understand their educational background, problem topics that they work on, tool usages, and activities. We cluster these data scientists based on the time spent for various activities and identify 9 distinct clusters of data scientists, and their corresponding characteristics. We also discuss the challenges that they face and the best practices they share with other data scientists. Our study finds several trends about data scientists in the software engineering context at Microsoft, and should inform managers on how to leverage data science capability effectively within their teams.

3rd International Workshop on Emotion Awareness in Software Engineering (SEmotion 2018)

Begel, Andrew and Serebrenik, Alexander and Graziotin, Daniel
Biometrics, Workshop 2018 IEEE/ACM 3rd International Workshop on Emotion Awareness in Software Engineering (SEmotion). May 2018.

Abstract

SEmotion 2018 Workshop Summary. Welcome to the 3rd International Workshop on Emotion Awareness in Software Engineering (SEmotion 2018)! This workshop, held at ICSE 2018, follows the second edition held at ICSE 2017. The workshop's aim is to create an international, sustainable forum for researchers and practitioners to meet, present, and discuss work on the role of affect and emotion in software engineering. Affective computing is the study and development of systems and devices that can recognize, interpret, process, and simulate human affect, i.e. the experience of feeling or emotion. In the last decade, research has shown how affective states influence work performance and team collaboration. This also applies to software engineering, an inherently collaborative activity involving people in a broad range of collaborative tasks where personality, moods, and emotions play crucial roles. To ensure the success of software engineering projects, stakeholders must experience positive affect, agree on display rules for emotions, and share mutual commitment towards project goals. By leveraging emotion awareness in software engineering, we can enhance development performance, improve software quality, help regulate the mood of a project team, and promote fruitful interactions between software engineering stakeholders. SEmotion 2018 addresses the opportunities and challenges of employing affective computing in software engineering. First, we investigate the impact of affective states (emotions, moods, attitudes, personality traits, etc.) on individual and group performance, commitment, and collaboration in software engineering. Second, we foster discussion on issues posed by exploiting affective computing as a new method for empirical software engineering.

A Study of the Organizational Dynamics of Software Teams

Hilton, Michael and Begel, Andrew
Human Aspects Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice. Gothenburg, Sweden. 2018.

Abstract

Large-scale software is developed by teams of engineers that work together. The teams' compositions change all the time, with engineers continuously leaving and joining. Learning about these organizational dynamics is vital to understanding how engineers acquire technical skills and business relationships throughout their career. In addition, since employee turnover can be costly to team morale and productivity, it is important for management to learn how to proactively guide the process. In this paper, we report on a study of a professional software development organization in which engineers switch teams frequently. We learned what causes engineers to consider leaving their teams, why they leave, how they learn about new teams, and how they decide which team to join. We also quantify the perceived costs and benefits of recent moves made by the engineers. In addition to reporting the answers to our research questions, we interpret our results to offer recommendations to engineers and their managers on how to ensure that both make better, happier team moves.

2018 Research Workshop on Autism at Work

Annabi, Hala and Begel, Andrew
Accessibility Autism at Work Research Workshop. April 2018.

Abstract

The Autism at Work Research Workshop brings together a small select group of leading scholars concerned with autism employment. Their work may relate to the preparation, recruitment, persistence, and advancement of individuals with autism in the workplace. Our objectives are to 1, build a community of scholars and practitioners concerned with issues related to the preparation and employment of individuals with autism; 2, develop a research agenda to advance evidence-based practices to equitably include individuals with autism in the workplace; and 3, identify types of interventions needed to create an inclusive workplace.

Eye Movements in Code Review

Begel, Andrew and Vrzakova, Hana
Proceedings of the Workshop on Eye Movements in Programming. Warsaw, Poland. 2018.

Abstract

In order to ensure sufficient quality, software engineers conduct code reviews to read over one another's code looking for errors that should be fixed before committing to their source code repositories. Many kinds of errors are spotted, from simple spelling mistakes and syntax errors, to architectural flaws that may span several files. However, we know little about how software developers read code when looking for defects. What kinds of code trigger engineers to check more deeply into suspected defects? How long do they take to verify whether a defect is really there? We conducted a study of 35 software engineers performing 40 code reviews while capturing their gaze with an eye tracker. We classified each code defect the developers found and captured the patterns of eye gazes used to deliberate about each one. We report how long it took to confirm defect suspicions for each type of defect and the fraction of time spent skimming the code vs. carefully reading it. This work provides a starting point for automating code reviews that could help engineers spend more time focusing on the difficult task of defect confirmation rather than the tedious task of defect discovery.

Cross-Disciplinary Perspectives on Collaborations with Software Engineers

Li, Paul Luo and Ko, Amy J. and Begel, Andrew
Human Aspects 2017 IEEE/ACM 10th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE). May 2017.

Abstract

Software engineering teams are usually interdisciplinary, consisting of both software engineers and non-software-engineers. While numerous studies have examined the success and failure of software engineering efforts from the perspective of software engineers, little is known about perspectives of expert non-software-engineers. In this study, we interviewed 46 experts across 10 roles at Microsoft (artists, content developers, data scientists, design researchers, designers, electrical engineers, mechanical engineers, product planners, program managers, service engineers) about their collaborations, good and bad, with software engineers. Overall, our experts described great software engineers as masters of their own technical domain, open-minded to the input of others, proactively informing everyone, and seeing the big picture of how pieces fit together. We discuss implications of our findings for practitioners, educators, and researchers.

Measuring Neural Efficiency of Program Comprehension

Siegmund, Janet and Peitek, Norman and Parnin, Chris and Apel, Sven and Hofmeister, Johannes and Kästner, Christian and Begel, Andrew and Bethmann, Anja and Brechmann, André
Biometrics, Human Aspects Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. Paderborn, Germany. 2017.

Abstract

Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as "beacons," for locating relevant code, and "plans," for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension.

2nd International Workshop on Emotion Awareness in Software Engineering (SEmotion 2017)

Novielli, Nicole and Begel, Andrew and Maalej, Walid
Biometrics, Workshop 2017 IEEE/ACM 2nd International Workshop on Emotion Awareness in Software Engineering (SEmotion). May 2017.

Abstract

SEmotion 2017 Workshop Summary. Affective computing is the study and development of systems and devices that can recognize, interpret, process, and simulate human affect, i.e. the experience of feelings or emotions. Over the past decade, research has shown the impact of affective states on work performance and team collaboration. Personality, moods, and emotions play crucial roles in software engineering because it involves people in a broad range of activities. For successful software engineering projects, stakeholders need to experience positive affect (such as trust or appreciation), to agree on display rules for emotions, and to hold mutual commitment to project goals. Recently, researchers have started to study the role of affective computing and affective states in software engineering, but contributions to this area are presented and discussed in too many different conferences and workshops. This workshop follows up from the first edition held at ICSE 2016. Its goal is to consolidate research and create an international, sustainable forum for researchers and practitioners interested in the role of affect in software engineering to meet, present, and discuss their work-in-progress. High-quality contributions related to empirical studies, theoretical models, and tools for supporting emotion awareness in software engineering are invited to the workshop, both from academia and industry.

Improving Communication Between Pair Programmers Using Shared Gaze Awareness

D'Angelo, Sarah and Begel, Andrew
Biometrics, Human Aspects Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. Denver, Colorado, USA. 2017.

Abstract

Remote collaboration can be more difficult than collocated collaboration for a number of reasons, including the inability to easily determine what your collaborator is looking at. This impedes a pair's ability to efficiently communicate about on-screen locations and makes synchronous coordination difficult. We designed a novel gaze visualization for remote pair programmers which shows where in the code their partner is currently looking, and changes color when they are looking at the same thing. Our design is unobtrusive, and transparently depicts the imprecision inherent in eye tracking technology. We evaluated our design with an experiment in which pair programmers worked remotely on code refactoring tasks. Our results show that with the visualization, pairs spent a greater proportion of their time concurrently looking at the same code locations. Pairs communicated using a larger ratio of implicit to explicit references, and were faster and more successful at responding to those references.

Keynote Talk: The Hitchhiker’s Guide to Engineering AI-Infused Applications

Begel, Andrew
Human Aspects, Invited Talk 19th International Conference on Product-Focused Software Process Improvement. November 2018.

Abstract

Artificial intelligence and machine learning (AI/ML) are some of the newest trends to hit the software industry, compelling organizations to evolve their development processes to deliver novel products to their customers. In this talk, I will describe how Microsoft software teams develop AI/ML-based applications using a nine-stage AI workflow process informed by prior experiences developing early AI applications (e.g. search and NLP) and data science tools (e.g. application telemetry and bug reporting). Adapting this workflow into their pre-existing, well-evolved, Agile-like software engineering processes and job roles has resulted in a number of engineering challenges unique to the AI/ML domain, some universal to all teams, but others related to the amount of prior AI/ML experience and education the teams have. I will tell you about some challenges and the solutions that teams have come up with. I believe there are three challenges in the AI/ML domain that make it fundamentally different from prior software engineering application domains: (1) discovering, managing, and versioning the data needed to power AI/ML is much more complex and difficult than in other types of software engineering; (2) AI/ML model customization and reuse practices require very different skills than are typically found in software teams; and (3) AI components do not modularize like software components (models may be “entangled” in complex ways and experience non-monotonic error behavior). The lessons that Microsoft has learned can help other organizations embarking on their own path towards AI and ML.

Guest editor's introduction to the Special Issue on Program Comprehension (ICPC 2014)

Roy, Chanchal K and Begel, Andrew and Moonen, Leon
Programming Languages Journal of Software: Evolution and Process, 28 (10), 838--839. 2016.

Keynote Talk: The ABCs of Software Engineering: Affect, Biometrics, and Cognition

Begel, Andrew
Global Software Development, Biometrics, Invited Talk Global Software Engineering (ICGSE), 2016 IEEE 11th International Conference on. August 2016.

Abstract

Researchers have long investigated how people read, write, and speak about software on their computers to identify the skills, education, and practices needed to acquire expertise and perform development duties effectively and efficiently. However, until now the methods used to study developer comprehension, expression, and communication have been limited and coarse-grained because there was no way to identify what a developer thought or felt unless it was expressed out loud. The world has changed. With the introduction of low-cost, widely available, high-fidelity biometric sensors, we can now more directly observe a software developer's cognitive and affective (emotional) processes. The ABCs of Software Engineering is a set of techniques that modernize classic approaches to program comprehension and human interaction by combining (A) principles governing the influence of human *affect* on behavior, (B) *biometric* sensors, and (C) models of *cognition* informed by advances in cognitive neuroscience. Technologies like electroencephalography (EEG), electro-dermal activity sensors (EDA), capacitive sensors, and eye trackers can reveal a software developer's internal emotional states, for example identifying when the developer is confused, frustrated, surprised, stressed, fatigued, or in a highly productive flow state. These affective states can be correlated with code quality, software complexity, development productivity, and effective communication --- the same software outcomes already correlated with developer activities in other research areas such as mining software repositories (MSR) and cooperative and human aspects of software engineering (CHASE). By developing a better understanding of what programmers think and feel when they create and maintain software, we can design tools and interventions to improve their productivity and reduce the impact of their errors.

The Emerging Role of Data Scientists on Software Development Teams

Kim, Miryung and Zimmermann, Thomas and DeLine, Robert and Begel, Andrew
Human Aspects Proceedings of the 38th International Conference on Software Engineering. Austin, Texas. 2016.

Abstract

Creating and running software produces large amounts of raw data about the development process and the customer usage, which can be turned into actionable insight with the help of skilled data scientists. Unfortunately, data scientists with the analytical and software engineering skills to analyze these large data sets have been hard to come by; only recently have software companies started to develop competencies in software-oriented data analytics. To understand this emerging role, we interviewed data scientists across several product groups at Microsoft. In this paper, we describe their education and training background, their missions in software engineering contexts, and the type of problems on which they work. We identify five distinct working styles of data scientists: (1) Insight Providers, who work with engineers to collect the data needed to inform decisions that managers make; (2) Modeling Specialists, who use their machine learning expertise to build predictive models; (3) Platform Builders, who create data platforms, balancing both engineering and data analysis concerns; (4) Polymaths, who do all data science activities themselves; and (5) Team Leaders, who run teams of data scientists and spread best practices. We further describe a set of strategies that they employ to increase the impact and actionability of their work.

Hands-on Sensors 101: Invited Session

Parnin, Chris and Begel, Andrew
Biometrics Proceedings of the 1st International Workshop on Emotion Awareness in Software Engineering. Austin, Texas. 2016.

Abstract

This will be a one-hour practicum where attendees, guided by experts, will try out biometric sensors and equipment, including eye trackers, electrodermal activity sensors, and heart rate monitors. Attendees will write code to collect data from a sensor and analyze it to compute operationalized metrics like cognitive load. These will be applied to a simple research experiment. Attendees will gain the basic knowledge needed to engage in research with psycho-physiological sensors.

Understanding the Challenges Faced by Neurodiverse Software Engineering Employees: Towards a More Inclusive and Productive Technical Workforce

Morris, Meredith Ringel and Begel, Andrew and Wiedermann, Ben
Human Aspects Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility. Lisbon, Portugal. 2015.

Abstract

Technology workers are often stereotyped as being socially awkward or having difficulty communicating, often with humorous intent; however, for many technology workers with atypical cognitive profiles, such issues are no laughing matter. In this paper, we explore the hidden lives of neurodiverse technology workers, e.g., those with autism spectrum disorder (ASD), attention deficit hyperactivity disorder (ADHD), and/or other learning disabilities, such as dyslexia. We present findings from interviews with 10 neurodiverse technology workers, identifying the challenges that impede these employees from fully realizing their potential in the workplace. Based on the interview findings, we developed a survey that was taken by 846 engineers at a large software company. In this paper, we reflect on the differences between the neurotypical (N = 781) and neurodiverse (N = 59) respondents. Technology companies struggle to attract, develop, and retain talented software developers; our findings offer insight into how employers can better support the needs of this important worker constituency.

8th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE 2015)

Begel, Andrew
Human Aspects, Workshop Proceedings of the 37th International Conference on Software Engineering. Florence, Italy. May 2015.

Abstract

Software is created for and with a wide range of stakeholders, from customers to management, from value-added providers to customer service personnel. These stakeholders work with teams of software engineers to develop and evolve software systems that support their activities. All of these people and their interactions are central to software development. Thus, it is crucial to investigate the dynamic and frequently changing Cooperative and Human Aspects of Software Engineering (CHASE), both before and after deployment, in order to understand current software practices, processes, and tools. In turn, this enables us to design tools and support mechanisms that improve software creation, software maintenance, and customer communication. Researchers and practitioners have long recognized the need to investigate these aspects; however, their articles are scattered across conferences and communities. This workshop will provide a unified forum for discussing high-quality research studies, models, methods, and tools for human and cooperative aspects of software engineering. This will be the 8th in a series of workshops, which continue to be a meeting place for the academic, industrial, and practitioner communities interested in this area, and will give opportunities to present and discuss works-in-progress.

Eye movements in code reading: relaxing the linear order

Busjahn, Teresa and Bednarik, Roman and Begel, Andrew and Crosby, Martha and Paterson, James H. and Schulte, Carsten and Sharif, Bonita and Tamm, Sascha
Biometrics, CS Education Proceedings of the 2015 IEEE 23rd International Conference on Program Comprehension. Florence, Italy. May 2015.

Abstract

Code reading is an important skill in programming. Inspired by the linearity that people exhibit while reading natural language text, we designed local and global gaze-based measures to characterize linearity (left-to-right and top-to-bottom) in reading source code. Unlike natural language text, source code is executable and requires a specific reading approach. To validate these measures, we compared the eye movements of novice and expert programmers who were asked to read and comprehend short snippets of natural language text and Java programs. Our results show that novices read source code less linearly than natural language text. Moreover, experts read code less linearly than novices. These findings indicate that there are specific differences between reading natural language and source code, and suggest that non-linear reading skills increase with expertise. We discuss the implications for practitioners and educators.

Eye Tracking in Computing Education

Busjahn, Teresa and Schulte, Carsten and Sharif, Bonita and Simon and Begel, Andrew and Hansen, Michael and Bednarik, Roman and Orlov, Paul and Ihantola, Petri and Shchekotova, Galina and Antropova, Maria
Biometrics, CS Education Proceedings of the Tenth Annual Conference on International Computing Education Research. Glasgow, Scotland, United Kingdom. August 2014.

Abstract

The methodology of eye tracking has been gradually making its way into various fields of science, assisted by the diminishing cost of the associated technology. In an international collaboration to open up the prospect of eye movement research for programming educators, we present a case study on program comprehension and preliminary analyses together with some useful tools. The main contributions of this paper are (1) an introduction to eye tracking to study programmers; (2) an approach that can help elucidate how novices learn to read and understand programs and to identify improvements to teaching and tools; (3) a consideration of data analysis methods and challenges, along with tools to address them; and (4) some larger computing education questions that can be addressed (or revisited) in the context of eye tracking.

Analyze This! 145 Questions for Data Scientists in Software Engineering

Begel, Andrew and Zimmermann, Thomas
Human Aspects Proceedings of the 36th International Conference on Software Engineering. Hyderabad, India. June 2014.

Abstract

In this paper, we present the results from two surveys related to data science applied to software engineering. The first survey solicited questions that software engineers would like data scientists to investigate about software, about software processes and practices, and about software engineers. Our analyses resulted in a list of 145 questions grouped into 12 categories. The second survey asked a different pool of software engineers to rate these 145 questions and identify the most important ones to work on first. Respondents favored questions that focus on how customers typically use their applications. We also saw opposition to questions that assess the performance of individual employees or compare them with one another. Our categorization and catalog of 145 questions can help researchers, practitioners, and educators to more easily focus their efforts on topics that are important to the software industry.

Using Psycho-physiological Measures to Assess Task Difficulty in Software Development

Fritz, Thomas and Begel, Andrew and Müller, Sebastian C. and Yigit-Elliott, Serap and Züger, Manuela
Biometrics Proceedings of the 36th International Conference on Software Engineering. Hyderabad, India. June 2014.

Abstract

Software developers make programming mistakes that cause serious bugs for their customers. Existing work to detect problematic software focuses mainly on post hoc identification of correlations between bug fixes and code. We propose a new approach to address this problem --- detect when software developers are experiencing difficulty while they work on their programming tasks, and stop them before they can introduce bugs into the code. In this paper, we investigate a novel approach to classify the difficulty of code comprehension tasks using data from psycho-physiological sensors. We present the results of a study we conducted with 15 professional programmers to see how well an eye-tracker, an electrodermal activity sensor, and an electroencephalography sensor could be used to predict whether developers would find a task to be difficult. We can predict nominal task difficulty (easy/difficult) for a new developer with 64.99% precision and 64.58% recall, and for a new task with 84.38% precision and 69.79% recall. We can improve the Naive Bayes classifier's performance if we train it on just the eye-tracking data over the entire dataset, or by using a sliding window data collection schema with a 55 second time window. Our work brings the community closer to a viable and reliable measure of task difficulty that could power the next generation of programming support tools.
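
For readers unfamiliar with the precision and recall figures quoted in this abstract, the following minimal sketch shows how the two metrics are computed for a binary easy/difficult classifier. The function name and the labels are illustrative assumptions, not the study's code or data.

```python
def precision_recall(y_true, y_pred, positive="difficult"):
    """Return (precision, recall) for the `positive` class.

    y_true and y_pred are equal-length sequences of labels.
    The labels used below are made up for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)  # true positives
    fp = sum(1 for t, p in pairs if p == positive and t != positive)  # false positives
    fn = sum(1 for t, p in pairs if p != positive and t == positive)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground-truth and predicted task difficulty for five tasks.
actual = ["difficult", "difficult", "easy", "easy", "difficult"]
predicted = ["difficult", "easy", "difficult", "easy", "difficult"]
precision, recall = precision_recall(actual, predicted)
```

Precision is the share of tasks predicted "difficult" that really were difficult; recall is the share of truly difficult tasks that the classifier caught.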

Analyzing Programming Tasks

Begel, Andrew
Biometrics, CS Education Proceedings of the First International Workshop for Eye Movements in Programming Education: Analyzing the Expert's Gaze. Joensuu, Finland. November 2013.

Abstract

In this position paper, I first describe the eyetracking patterns of the two participant videos I watched and coded. Next, I reflect on the methods and validity of manual coding and interpretation, and finally, I add my own thoughts on the utility of eyetracking data for understanding and helping programmers create and maintain software.

Have Agile Techniques been the Silver Bullet for Software Development at Microsoft?

Murphy, Brendan and Bird, Christian and Zimmermann, Thomas and Williams, Laurie and Nagappan, Nachiappan and Begel, Andrew
Agile Methods, Human Aspects Proceedings of ACM / IEEE International Symposium on Empirical Software Engineering and Measurement. Baltimore, Maryland. October 2013.

Abstract

Background. The pressure to release high-quality, valuable software products at an increasingly faster rate is forcing software development organizations to adapt their development practices. Agile techniques began emerging in the mid-1990s in response to this pressure and to increased volatility of customer requirements and technical change. Theoretically, agile techniques seem to be the silver bullet for responding to these pressures on the software industry. Aims. This paper tracks the changing attitudes to agile adoption and techniques, within Microsoft, in one of the largest longitudinal surveys of its kind (2006-2012). Method. We collected the opinions of 1,969 agile and non-agile practitioners in five surveys over a six-year period. Results. The survey results reveal that despite intense market pressure, the growth of agile adoption at Microsoft is slower than would be expected. Additionally, no individual agile practice exhibited strong growth trends. We also found that while development practices of teams may be similar, some perceive and declare themselves to be following an agile methodology while others do not. Both agile and non-agile practitioners agree on the relative benefits and problem areas of agile techniques. Conclusions. We found no clear trends in practice adoption. Non-agile practitioners are less enamored of the benefits and more strongly in agreement with the problem areas. The ability for agile practices to be used by large-scale teams generally concerned all respondents, which may limit their future adoption.

2nd International Workshop on User Evaluations for Software Engineering Researchers

Begel, Andrew and Sadowski, Caitlin
Workshop, Human Aspects Proceedings of the 2013 International Conference on Software Engineering. San Francisco, CA, USA. May 2013.

Abstract

We have met many software engineering researchers who would like to evaluate a tool or system they developed with real users, but do not know how to begin. In this second iteration of the USER workshop, attendees will collaboratively design, develop, and pilot plans for conducting user evaluations of their own tools and/or software engineering research projects. Attendees will gain practical experience with various user evaluation methods through scaffolded group exercises, panel discussions, and mentoring by a panel of user-focused software engineering researchers. Together, we will establish a community of like-minded researchers and developers to help one another improve our research and practice through user evaluation.

Deciphering the Story of Software Development Through Frequent Pattern Mining

Bettenburg, Nicolas and Begel, Andrew
Human Aspects Proceedings of the 2013 International Conference on Software Engineering. San Francisco, CA, USA. May 2013.

Abstract

Software teams record their work progress in task repositories which often require them to encode their activities in a set of edits to field values in a form-based user interface. When others read the tasks, they must decode the schema used to write the activities down. We interviewed four software teams and found out how they used the task repository fields to record their work activities. However, we also found that they had trouble interpreting task revisions that encoded multiple activities at the same time. To assist engineers in decoding tasks, we developed a scalable method based on frequent pattern mining to identify patterns of frequently co-edited fields that each represent a conceptual work activity. We applied our method to two years of our interviewees' task repositories and were able to abstract 83,000 field changes into just 27 patterns that cover 95% of the task revisions. We used the 27 patterns to render the teams' tasks in web-based English newsfeeds and evaluated them with the product teams. The teams agreed with most of our patterns and English interpretations, but outlined a number of improvements that we will incorporate into future work.
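
To illustrate the general idea of mining frequently co-edited fields, here is a minimal brute-force sketch. It is not the scalable method from the paper: it simply counts every subset of fields edited together in a revision and keeps those that meet a minimum support. The field names, the `frequent_field_sets` function, and the threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_field_sets(revisions, min_support):
    """Count sets of fields edited together and keep the frequent ones.

    revisions: iterable of collections of field names, one per task revision.
    min_support: minimum number of revisions a field set must appear in.
    Brute force: enumerates every subset, so it only suits small revisions.
    """
    counts = Counter()
    for fields in revisions:
        unique = sorted(set(fields))
        for size in range(1, len(unique) + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Hypothetical task revisions; each set lists the fields changed in one edit.
revisions = [
    {"State", "Assigned To"},
    {"State", "Assigned To"},
    {"State", "Resolution"},
    {"Remaining Work"},
]
patterns = frequent_field_sets(revisions, min_support=2)
```

In this toy data, the pair ("Assigned To", "State") recurs and might be read as a single conceptual activity such as handing a task to someone. A real repository would call for a proper frequent-itemset algorithm rather than subset enumeration.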

App-Directed Learning: An Exploratory Study

Sillito, Jonathan and Begel, Andrew
CS Education, Human Aspects Proceedings of 6th International Workshop on Cooperative and Human Aspects of Software Engineering. San Francisco, California. May 2013.

Abstract

Learning a new platform is a common, yet difficult task for software developers today. A range of resources, both official resources (i.e., those provided by the platform owner) and those provided by the wider developer community, are available to help developers. To increase our understanding of the learning process and the resources developers use, we conducted an interview and diary study in which ten developers told us about their experience learning to develop Windows Phone applications. We report on a preliminary analysis of our data viewed through the lens of self-directed learning. Using this lens, we characterize the learning strategies of our subjects as app-directed, and describe some of the particular challenges our subjects faced due to this strategy.

Facilitating Enterprise Software Developer Communication with CARES

Guzzi, Anja and Begel, Andrew and Miller, Jessica K. and Nareddy, Krishna
Human Aspects, Tool Proceedings of 28th IEEE International Conference on Software Maintenance. Riva del Garda, Italy. September 2012.

Abstract

Enterprise software developers must regularly communicate with one another to obtain information and coordinate changes to legacy code, but find it cumbersome and complicated to determine the most relevant and expedient person to contact. This becomes especially difficult when the relevant person has transferred teams or changed their personal contact information since contributing to the project. We conducted a year-long series of surveys and interviews to help us learn how, why, and how often software developers discover and communicate with one another. In response to what we saw, we designed, deployed, and evaluated a domain-specific, IDE-embedded, photo-oriented, communication tool. We overcame a significant challenge found in long-lived projects: uniquely identifying individuals years after their contributions to the project. After deploying our tool, iteratively refining it, and deploying it again on a company-wide scale, most users reported that it simplified the process of finding and reaching out to other developers and offered them a sense of community with their colleagues, even if those colleagues did not currently work on their team. The lessons learned from our study and tool development should apply to other large, multi-team, legacy software projects.

Facilitating Communication Between Engineers with CARES. Demo.

Guzzi, Anja and Begel, Andrew
Human Aspects, Tool Proceedings of the 34th International Conference on Software Engineering. Zurich, Switzerland. June 2012.

Abstract

When software developers need to exchange information or coordinate work with colleagues on other teams, they are often faced with the challenge of finding the right person to communicate with. In this paper, we present our tool, called CARES (Colleagues and Relevant Engineers' Support), which is an integrated development environment (IDE)-based tool that enables engineers to easily discover and communicate with the people who have contributed to the source code. CARES has been deployed to 30 professional developers, and we interviewed 8 of them after 3 weeks of evaluation. They reported that CARES helped them to more quickly find, choose, and initiate contact with the most relevant and expedient person who could address their needs.

First International Workshop on User Evaluation for Software Engineering Researchers

Begel, Andrew and Sadowski, Caitlin
Workshop, Human Aspects Zurich, Switzerland. May 2012.

Abstract

We have met many software engineering researchers who would like to evaluate a tool or system they developed with real users, but do not know how to begin. In this workshop, participants will interactively learn suitable usability methods through scaffolded group exercises, collaboratively develop and test plans for evaluating their projects, and construct a support network of like-minded researchers to help them achieve their goals.

Industrial Program Comprehension Challenge 2011: Archeology and Anthropology of Embedded Control Systems

Begel, Andrew and Quante, Jochen
Human Aspects Proceedings of the 19th International Conference on Program Comprehension. Kingston, Ontario, Canada. June 2011.

Abstract

The Industrial Program Comprehension Challenge is a two-year-old track of the International Conference on Program Comprehension that provides a venue for researchers and industrial practitioners to communicate about new research directions that can help address real world problems. This year, 2011, a scenario-based challenge was created to inspire researchers to apply the best "archaeological" techniques for understanding the complexity of industrial software, and foster appreciation for the delicate "anthropological" scenario which drives the behavior of the software engineers, management, and customers. Participants had two months to work on the challenge and submit write-ups of their solutions. Acceptable submissions were exhibited as posters, while the best solutions were presented during the Industrial Challenge conference session. This new challenge format gives researchers the opportunity to present their novel techniques, tools and ideas to the community.

Is Integration of Communication and Technical Instruction across the SE Curriculum a Viable Strategy for Improving the Real-World Communication Abilities of Software Engineering Graduates?

Gannod, Gerald C. and Anderson, Paul V. and Burge, Janet E. and Begel, Andrew
CS Education, Panel Proceedings of the 24th IEEE-CS Conference on Software Engineering Education and Training (CSEE&T). Honolulu, Hawaii. May 2011.

Abstract

Software engineering educators and trainers are acutely aware that software engineering graduates need strong real-world communication abilities. The National Science Foundation is supporting a three-year project in which industry professionals, CS/SE faculty, and communication-across-the-curriculum specialists are collaborating to develop curricula and teaching resources designed to improve communication abilities of CS/SE graduates by integrating communication instruction and assignments with the technical work in courses across the students' four years of study. Our panelists -- an industry practitioner, a CS/SE educator, and a communication specialist -- will describe what has been learned in the project's first half and invite comments, insights and advice from the audience.

Coordination in Large-scale Software Development: Helpful and Unhelpful Behaviors

Begel, Andrew and Nagappan, Nachiappan and Poile, Christopher and Layman, Lucas
Human Aspects Redmond, Washington. April 2011.

Abstract

Large-scale software development requires coordination within and between very large engineering teams which may be located in different buildings, on different company campuses, and in different time zones. At Microsoft Corporation, we studied a 3-year-old, 300-person software application team based in Redmond, WA to learn how they coordinate with three intra-organization, physically distributed dependencies: a platform library team also in Redmond; a team three time zones away in Boston, MA; and a team in Hyderabad, India. Thirty-one interviews with 26 team members revealed that coordination was most impacted by issues of communication, capacity and cooperation. Distributed teams faced additional challenges due to time zone and cultural differences between the team members. We support our findings with a survey of 775 engineers across Microsoft who described their experiences managing coordination in their own software products. We suggest new processes and tools to improve team coordination.

Novice Professional: Recent Graduates in a First Software Engineering Job

Begel, Andrew and Simon, Beth
Human Aspects, CS Education Making software: what really works, and why we believe it. October 2010.

Abstract

Much is written about software engineering education - how to teach novice computer scientists the programming, design and testing skills they need to become professional software engineers. However, computer science students are not done with their education at graduation; it is really just the beginning. Newly hired engineers must learn to edit, debug, and create code on a deadline while learning to communicate and interact appropriately with a large team of colleagues. In this chapter, we explore the similarities and differences between these two educational experiences, by providing a detailed view of the novice experience of software developers in their first industry job.

From Program Comprehension to People Comprehension

Begel, Andrew
Human Aspects, Panel Proceedings of the 18th International Conference on Program Comprehension. Braga, Portugal. June 2010.

Abstract

Large-scale software engineering requires many teams to collaborate together to create software products. The problems these teams suffer trying to coordinate their joint work can be addressed through tools inspired by social networking. Social networking tools help people to more easily discover and more efficiently maintain relationships than is feasible using one-to-one or face-to-face interactions. Applying these ideas to the software domain requires new kinds and combinations of software program and process analyses that overcome intrinsic limitations in the accuracy of the underlying data sources and the ambiguity inherent in human relationships.

Three Things Every CS Educator Should Know About Their Students' Future Careers in Software Development: Keynote Address

Begel, Andrew
Invited Talk, CS Education Journal of Computing Sciences in Colleges, 25 (4), 125--125. April 2010.

Abstract

Computer science education is fundamentally about transitioning students from novices to experts. As students learn new hard and soft skills, and master them, they grow more confident in their abilities and interactions with others. We are pleased to see them become big fish in a small pond. But, when college graduates enter the software engineering workforce, just how well do they fare? In this talk, I'll show you three surprising challenges that we saw newly graduated Computer Science students overcome as they began careers in software development at Microsoft. With the adoption of some innovative pedagogical approaches in Computer Science education already being taught in universities around the world, I think we can ease the transition and better prepare students for positions in the software industry.

Not Seen and Not Heard: Onboarding Challenges in Newly Virtual Teams

Hemphill, Libby and Begel, Andrew
Human Aspects Redmond, Washington. September 2009.

Abstract

Virtual teams, in which the members work from multiple locations, have become a common feature at many global organizations. In spite of this new reality, collocated teams experience difficulties in adapting their established processes and practices for a newly virtual working environment, greatly impeding their performance, productivity, and morale. In this paper, we present findings from a qualitative case study of five software teams that hired and onboarded their first remote team member. Our analyses focus on three underappreciated aspects of the virtual onboarding process: trying to learn team practices as the team changes them, building and maintaining social relationships with physically remote teammates, and evaluating and managing expectations of performance from afar. From the results of our analyses, we pose seven propositions about virtual onboarding that should be explored in future studies.

Coordination in Large-scale Software Teams

Begel, Andrew and Nagappan, Nachiappan and Poile, Christopher and Layman, Lucas
Human Aspects Proceedings of the Workshop on Cooperative and Human Aspects on Software Engineering. Vancouver, BC, Canada. May 2009.

Abstract

Large-scale software development requires coordination within and between very large engineering teams which may be located in different buildings, on different company campuses, and in different time zones. From a survey answered by 775 Microsoft software engineers, we learned how work was coordinated within and between teams and how engineers felt about their success at these tasks. The respondents revealed that the most common objects of coordination are schedules and features, not code or interfaces, and that more communication and personal contact worked better to make interactions between teams go more smoothly.

Three Things Every CS Educator Should Know About Their Students' Future Careers in Software Development

Begel, Andrew
Invited Talk, CS Education Journal of Computing Sciences in Colleges, 24 (4), 143--143. April 2009.

Abstract

Computer science education is fundamentally about transitioning students from novices to experts. As students learn new hard and soft skills, and master them, they grow more confident in their abilities and interactions with others. We are pleased to see them become big fish in a small pond. But, when college graduates enter the software engineering workforce, just how well do they fare? In this talk, I'll show you three surprising challenges that we saw newly graduated Computer Science students overcome as they began careers in software development at Microsoft. With the adoption of some innovative pedagogical approaches in Computer Science education already being taught in universities around the world, I think we can ease the transition and better prepare students for positions in the software industry.

Pair Programming: What's In It for Me?

Begel, Andrew and Nagappan, Nachiappan
Human Aspects Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement. Kaiserslautern, Germany. October 2008.

Abstract

Pair programming is a practice in which two programmers work collaboratively at one computer on the same design, algorithm, or code. Prior research on pair programming has primarily focused on its evaluation in academic settings. There has been limited evidence on the use, problems and benefits, partner selection, and the general perceptions towards pair programming in industrial settings. In this paper we report on a longitudinal evaluation of pair programming at Microsoft Corporation. We find from the results of a survey sent to a randomly selected 10% of engineers at Microsoft that 22% pair program or have pair programmed in the past. Using qualitative analysis, we performed a large-scale card sort to group the various benefits and problems of pair programming. The biggest perceived benefits of pair programming were the introduction of fewer bugs, spreading code understanding, and producing overall higher quality code. The top problems were cost-efficiency, (work time) scheduling problems, and personality conflicts. Most engineers preferred a partner who had complementary skills to their own, who was flexible and had good communication skills.

How Will You See My Greatness if You Can't See Me? Poster.

Hemphill, Libby and Begel, Andrew
Human Aspects, Poster Proceedings of the ACM Conference on Computer Supported Cooperative Work. San Diego, California. November 2008.

Abstract

Newly hired employees go through a ramp-up period of acclimating to their organization. This period, known as onboarding, is often stressful and challenging for both the new hires and their managers. In globally distributed software development teams, the onboarding process may also be distributed; new hires may be in completely different locations than their managers and teammates. We are conducting a qualitative study of new hires who work remotely from their software development teams. Our data indicate that these new hires are impacted by their struggle to get noticed by their managers and teams during their first few weeks on the job. In this poster, we offer evidence that remote new hires are frustrated by their diminished opportunities to demonstrate proficiency to their managers and are unable to observe some important kinds of great work by their teammates.

Novice Software Developers, All over Again

Begel, Andrew and Simon, Beth
CS Education, Human Aspects Proceedings of the Fourth International Workshop on Computing Education Research. Sydney, Australia. August 2008.

Abstract

Transitions from novice to expert often cause stress and anxiety and require specialized instruction and support to enact efficiently. While many studies have looked at novice computer science students, very little research has been conducted on professional novices. We conducted a two-month in-situ qualitative case study of new software developers in their first six months working at Microsoft. We shadowed them in all aspects of their jobs: coding, debugging, designing, and engaging with their team, and analyzed the types of tasks in which they engage. We can explain many of the behaviors revealed by our analyses if viewed through the lens of newcomer socialization from the field of organizational management. This new perspective also enables us to better understand how current computer science pedagogy prepares students for jobs in the software industry. We consider the implications of this data and analysis for developing new processes for learning in both university and industrial settings to help accelerate the transition from novice to expert software developer.

Global Software Development: Who Does It?

Begel, Andrew and Nagappan, Nachiappan
Global Software Development, Human Aspects Proceedings of IEEE International Conference on Global Software Engineering. Bangalore, India. August 2008.

Abstract

In today's world, software development is increasingly spread across national and geographic boundaries. There is limited empirical evidence about the number and distribution of people in a large software company who have to deal with global software development (GSD). Is GSD restricted to a select few in a company? How many time zones do engineers have to deal with? Do managers have to deal with GSD more than individual engineers? What are the benefits and problems that engineers see with GSD? How have they tried to improve GSD coordination? These are interesting questions to be addressed in an empirical context. In this paper, we report on the results of a large-scale survey of software engineers at Microsoft Corporation. We found that a very high proportion of engineers are directly involved with GSD. In addition, more than 50% of the respondents regularly collaborate with people more than three time zones away. Engineers also report that communication difficulties around coordination are the most critical, yet difficult to solve issues with GSD.

Effecting Change: Coordination in Large-scale Software Development

Begel, Andrew
Human Aspects Proceedings of the International Workshop on Cooperative and Human Aspects of Software Engineering. Leipzig, Germany. May 2008.

Abstract

Large-scale software development requires coordination within and between very large engineering teams, each of which may be located in different locations and time zones. Numerous studies, and indeed, a whole conference (ICGSE), are dedicated to discovering the causes of problems with distributed development in the software industry. Microsoft has long had product teams too large to be considered co-located, even when sitting in neighboring buildings on the same campus. Recently, it has been expanding its engineering workforce into India and China, and our research is showing that Microsoft is encountering many of the coordination problems that go along with differences of location, time zone, and culture. As we go forward, our research has been changing from learning about the problem to experimenting with solutions. What are the best practices for improving coordination? Can they be applied to all software teams? How does one move past simple readings of research results towards effective intervention?

Mining Software Effort Data: Preliminary Analysis of Visual Studio Team System Data

Layman, Lucas and Nagappan, Nachiappan and Guckenheimer, Sam and Beehler, Jeff and Begel, Andrew
Human Aspects Proceedings of the 2008 International Working Conference on Mining Software Repositories. Leipzig, Germany. May 2008.

Abstract

In the software development process, scheduling and predictability are important components to delivering a product on time and within budget. Effort estimation artifacts offer a rich data set for improving scheduling accuracy and for understanding the development process. Effort estimation data for 55 features in the latest release of Visual Studio Team System (VSTS) were collected and analyzed for trends, patterns, and differences. Statistical analysis shows that actual estimation error was positively correlated with feature size, and that in-process metrics of estimation error were also correlated with the final estimation error. These findings suggest that smaller features can be estimated more accurately, and that in-process estimation error metrics can provide a quantitative supplement to developer intuition regarding high-risk features during the development process.

Deep Intellisense: A Tool for Rehydrating Evaporated Information

Holmes, Reid and Begel, Andrew
Human Aspects, Tool Proceedings of the 2008 International Working Conference on Mining Software Repositories. Leipzig, Germany. May 2008.

Abstract

Software engineers working in large teams on large, long-lived code-bases have trouble understanding why the source code looks the way it does. Often, they answer their questions by looking at past revisions of the source code, bug reports, code checkins, mailing list messages, and other documentation. This process of inquiry can be quite inefficient, especially when the answers they seek are located in isolated repositories accessed by multiple independent investigation tools. Prior mining approaches have focused on linking various data repositories together; in this paper we investigate techniques for displaying information extracted from the repositories in a way that helps developers to build a cohesive mental model of the rationale behind the code. After interviewing several developers and testers about how they investigate source code, we created a Visual Studio plugin called Deep Intellisense that summarizes and displays historical information about source code. We designed Deep Intellisense to address many of the hurdles engineers face with their current techniques, and help them spend less time gathering information and more time getting their work done.

Struggles of New College Graduates in Their First Software Development Job

Begel, Andrew and Simon, Beth
CS Education, Human Aspects Proceedings of the 39th SIGCSE Technical Symposium on Computer Science Education. Portland, OR, USA. March 2008.

Abstract

How do new college graduates experience their first software development jobs? In what ways are they prepared by their educational experiences, and in what ways do they struggle to be productive in their new positions? We report on a "fly-on-the-wall" observational study of eight recent college graduates in their first six months of a software development position at Microsoft Corporation. After a total of 85 hours of on-the-job observation, we report on the common abilities evidenced by new software developers including how to program, how to write design specifications, and evidence of persistence strategies for problem-solving. We also classify some of the common ways new software developers were observed getting stuck: communication, collaboration, technical, cognition, and orientation. We report on some common misconceptions of new developers which often frustrate them and hinder them in their jobs, and conclude with recommendations to align Computer Science curricula with the observed needs of new professional developers.

Codifier: A Programmer-Centric Search User Interface

Begel, Andrew
Human Aspects, Tool Proceedings of Workshop on Human-Computer Interaction and Information Retrieval. Cambridge, Massachusetts. October 2007.

Abstract

Search tools have transformed knowledge discovery by exposing information from previously hidden repositories to the workers who need it. Search engines like Google and Live.com provide search capabilities via a simple one-line text query box, and present results in a paged HTML list. When the repository being searched contains structured information with extractable metadata (e.g. program source code), it can be advantageous to index the metadata and use it to enable queries that are more task-centric and suitable for a domain-specific audience. Codifier is a programmer-centric search user interface that enables software developers to ask domain-specific questions related to programming languages and software.

Usage and Perceptions of Agile Software Development in an Industrial Context: An Exploratory Study

Begel, Andrew and Nagappan, Nachiappan
Agile Methods, Human Aspects Proceedings of the 1st International Symposium on Empirical Software Engineering and Measurement, 2007. Madrid, Spain. September 2007.

Abstract

Agile development methodologies have been gaining acceptance in the mainstream software development community. While there are numerous studies of agile development in academic and educational settings, there has been little detailed reporting of the usage, penetration and success of agile methodologies in traditional, professional software development organizations. We report on the results of an empirical study conducted at Microsoft to learn about agile development and its perception by people in development, testing, and management. We found that one-third of the study respondents use agile methodologies to varying degrees, and most view it favorably due to improved communication between team members, quick releases and the increased flexibility of agile designs. The scrum variant of agile methodologies is by far the most popular at Microsoft. Our findings also indicate that developers are most worried about scaling agile to larger projects (greater than twenty members), attending too many meetings and coordinating agile and non-agile teams.

UW/MSR Summer Institute on the Human Side of Software Development

Eds. DeLine, Robert and Venolia, Gina and Begel, Andrew and Notkin, David and Hendry, David
Workshop, Human Aspects Stevenson, Washington. August 2007.

Abstract

Each summer, the University of Washington and Microsoft Research jointly host a workshop on a cross-disciplinary topic to bring together researchers who share a common interest but would not likely meet one another in their normal travels. You can find a description of past summer institutes at: http://www.cs.washington.edu/mssi/. This year's workshop focuses on software development as a human activity for individuals, teams and organizations. The workshop will bring together roughly 50 participants from academia, industry and government who bring perspectives from diverse disciplines, including software engineering, human-computer interaction, computer-supported collaborative work, psychology, and organizational behavior. Because existing research in this area is so widespread, our main goal is community building. We hope this workshop will form a large step toward establishing this problem area as an established field of research. The meeting will include presentations, including keynotes from each field, panels and breakout sessions, with planned flexibility to allow us to tailor to the participants' goals.

End User Programming for Scientists: Modeling Complex Systems

Begel, Andrew
Human Aspects Dagstuhl Seminar Proceedings on End-User Software Engineering. Dagstuhl, Germany. February 2007.

Abstract

Towards the end of the 20th century, a paradigm shift took place in many scientific labs. Scientists embarked on a new form of scientific inquiry seeking to understand the behavior of complex adaptive systems that increasingly defied traditional reductive analysis. By combining experimental methodology with computer-based simulation tools, scientists gain greater understanding of the behavior of systems such as forest ecologies, global economies, climate modeling, and beach erosion. This improved understanding is already being used to influence policy in critical areas that will affect our nation's future, and the world's.

Help, I Need Somebody! Poster.

Begel, Andrew
Human Aspects, Poster Proceedings of Workshop on Supporting the Social Side of Large-Scale Software Development. Banff, Alberta, Canada. November 2006.

Abstract

Information discovery is a very difficult and frustrating aspect of software development. Novice developers are often assigned a mentor who preemptively provides answers and advice without requiring the novice to explicitly ask for help. A similar situation occurs among expert developers in radically collocated settings. The close proximity enhances communication between all members of a group, providing needed information, often preemptively due to ambient awareness of other developers. In this paper, we propose a mechanism to extend this desirable property of preemptive mentoring to developers in more traditional software engineering environments. The proposed system will infer when and how a developer becomes blocked looking for information, and notify an appropriate expert to come to his aid. We believe that this preemptive help will lower developer frustration and enhance diffusion of expert knowledge throughout an organization.

An Assessment of a Speech-Based Programming Environment

Begel, Andrew and Graham, Susan L.
Programming Languages, Human Aspects, Tool Proceedings of the IEEE Symposium on Visual Languages and Human-Centric Computing. Brighton, England, United Kingdom. September 2006.

Abstract

Programmers who suffer from repetitive stress injuries find it difficult to program by typing. Speech interfaces can reduce the amount of typing, but existing programming-by-voice tools make it awkward for programmers to enter and edit program text. We used a human-centric approach to address these problems. We first studied how programmers verbalize code, and found that spoken programs contain lexical, syntactic and semantic ambiguities that do not appear in written programs. Using the results from this study, we designed Spoken Java, a syntactically similar, yet semantically identical variant of Java that is easier to speak. We built an Eclipse IDE plug-in called SPEED (for speech editor) to support the combination of Spoken Java and an associated command language. In this paper, we report the results of the first study ever of any working programming-by-voice system. Our evaluation with expert Java developers showed that most developers had little trouble learning to use the system via spoken commands, but were reluctant to speak literal code out loud. As expected, programmers found programming by voice to be slower than typing.

Cognitive Perspectives on the Role of Naming in Computer Programs

Liblit, Ben and Begel, Andrew and Sweetser, Eve
Human Aspects Proceedings of the 18th Annual Psychology of Programming Workshop. Brighton, England, United Kingdom. September 2006.

Abstract

Programming a computer is a complex, cognitively rich process. This paper examines ways in which human cognition is reflected in the text of computer programs. We concentrate on naming: the assignment of identifying labels to programmatic constructs. Naming is arbitrary, yet programmers do not select names arbitrarily. Rather, programmers choose and use names in regular, systematic ways that reflect deep cognitive and linguistic influences. This, in turn, allows names to carry semantic cues that aid in program understanding and support the larger software development process.

XGLR - An Algorithm for Ambiguity in Programming Languages.

Begel, Andrew and Graham, Susan L.
Programming Languages Science of Computer Programming, 61 (3), 211-227. August 2006.

Abstract

Automatically generated lexers and parsers for programming languages have a long history. Although they are well suited for many languages, many widely used generators, among them Flex and Bison, fail to handle input stream ambiguities that arise in embedded languages, in legacy languages, and in programming by voice. We have developed Blender, a combined lexer and parser generator that enables designers to describe many classes of embedded languages and to handle ambiguities in spoken input and in legacy languages. We have enhanced the incremental lexing and parsing algorithms in our Harmonia framework to analyse lexical, syntactic and semantic ambiguities. The combination of better language description and enhanced analysis provides a powerful platform on which to build the next generation of language analysis tools.

Kinesthetic Learning in the Classroom

Begel, Andrew and Bates, Rebecca and Wolfman, Steven A.
Workshop, CS Education Dallas, Texas. March 2006.

Spoken Language Support for Software Development

Begel, Andrew
Human Aspects, Programming Languages, Dissertation Proceedings of the Doctoral Consortium of the 33rd SIGCSE Technical Symposium on Computer Science Education. Berkeley, California. December 2005.

Abstract

Programmers who suffer from repetitive stress injuries find it difficult to program by typing. Speech interfaces can reduce the amount of typing, but existing programming-by-voice techniques make it awkward for programmers to enter and edit program text. We used a human-centric approach to address these problems. We first studied how programmers verbalize code, and found that spoken programs contain lexical, syntactic and semantic ambiguities that do not appear in written programs. Using the results from this study, we designed Spoken Java, a semantically identical variant of Java that is easier to speak. Inspired by a study of how voice recognition users navigate through documents, we developed a novel program navigation technique that can quickly take a software developer to a desired program position. Spoken Java is analyzed by extending a conventional Java programming language analysis engine written in our Harmonia program analysis framework. Our new XGLR parsing framework extends GLR parsing to process the input stream ambiguities that arise from spoken programs (and from embedded languages). XGLR parses Spoken Java utterances into their many possible interpretations. To semantically analyze these interpretations and discover which ones are legal, we implemented and extended the Inheritance Graph, a semantic analysis formalism which supports constant-time access to type and use-definition information for all names defined in a program. The legal interpretations are the ones most likely to be correct, and can be presented to the programmer for confirmation. We built an Eclipse IDE plugin called SPEED (for SPEech EDitor) to support the combination of Spoken Java, an associated command language, and a structure-based editing model called Shorthand. Our evaluation of this software with expert Java developers showed that most developers had little trouble learning to use the system, but found it slower than typing. 
Although programming-by-voice is still in its infancy, it has already proved to be a viable alternative to typing for those who rely on voice recognition to use a computer. In addition, by providing an alternative means of programming a computer, we can learn more about how programmers communicate about code.

Spoken Programs

Begel, Andrew and Graham, Susan L.
Programming Languages, Human Aspects, Tool Proceedings of IEEE Symposium on Visual Languages and Human-Centric Computing. Dallas, Texas. September 2005.

Abstract

Programmers who suffer from repetitive stress injuries find it difficult to spend long amounts of time typing code. Speech interfaces can help developers reduce their dependence on typing. However, existing programming by voice techniques make it awkward for programmers to enter and edit program text. To design a better alternative, we conducted a study to learn how software developers naturally verbalize programs. We found that spoken programs are different from written programs in ways similar to the differences between spoken and written English; spoken programs contain lexical, syntactic and semantic ambiguities that do not appear in written programs. Using the results from this study, we designed Spoken Java, a semantically identical variant of Java that is easier to say out loud. Using Spoken Java, software developers can speak more naturally by verbalizing their program code as if they were reading it out loud. Spoken Java is analyzed by extending a conventional Java programming language analysis engine written in our Harmonia program analysis framework to support the kinds of ambiguities that arise from speech.

Kinesthetic Learning in the Classroom

Begel, Andrew
Workshop, CS Education Philadelphia, Pennsylvania. June 2005.

Programming by Voice: A Domain-Specific Application of Speech Recognition

Begel, Andrew
Human Aspects, Programming Languages Proceedings of AVIOS Speech Technology Symposium - SpeechTek West. San Francisco, California. March 2005.

Abstract

Programming environments can create frustrating barriers for the growing numbers of software developers that suffer from repetitive strain injuries (RSI) and related disabilities that make typing difficult or impossible. Not only is the software development process comprised of fairly text-intensive activities like program composition, editing and navigation, but the tools used for programming are also operated textually. This results in a work environment for programmers in which long hours of RSI-exacerbating typing are unavoidable.

Kinesthetic Learning in the Classroom

Begel, Andrew and Garcia, Daniel D. and Wolfman, Steven A.
Workshop, CS Education St. Louis, Missouri. March 2005.

Kinesthetic Learning in the Classroom

Begel, Andrew
Workshop, CS Education St. Louis, Missouri. February 2005.

StarLogo: A Programmable Complex Systems Modeling Environment for Students and Teachers

Begel, Andrew and Klopfer, Eric
CS Education, Programming Languages Artificial Life Models in Software. 2005.

Transformational Generation of Language Plug-ins in the Harmonia Framework

Begel, Andrew and Boshernitsan, Marat and Graham, Susan L.
Programming Languages Berkeley, California. January 2005.

Abstract

The Harmonia framework provides an infrastructure for building language-aware interactive programming tools. Harmonia supports many languages through language plug-ins, which are dynamically-loadable system extensions generated from lexical, syntactic, and semantic descriptions. In this report, we describe our approach to generating Harmonia language plug-ins from a variety of domain-specific description languages. We present the process of configuring plug-in analysis components, the transformations for high-level syntactic and semantic descriptions, and the optimizations for generated code. This largely ad hoc process makes our generation techniques expensive to create and difficult to maintain. We propose a new component-based architecture based on transformational generation, present its benefits, and outline several research directions that still need to be addressed by the generative programming community.

StarLogo TNG. An introduction to game development

Klopfer, Eric and Begel, Andrew
CS Education, Programming Languages Journal of E-Learning. 2005.

Abstract

The science of developing computer programs offers a rich educational experience that can help students gain fluency with information technology. Unfortunately, while computers have become commonplace in schools, the practice of teaching programming is being squeezed out of high school and middle school curricula. We believe that programming should be reintroduced to students, and that this can be done by focusing on video game construction, a compelling subject area for many students. Given the current expertise required to create a modern video game, new tools are needed to make this experience accessible to students. We have developed StarLogo TNG, a visual programming- and 3D-based environment that enables students to easily program their own games. It uses graphical programming to ease the learning curve for programming, and 3D graphics to make the developed games more realistic. An initial pilot study has shown that these innovations appeal to students, and in particular appeal to girls.

Language Analysis and Tools for Ambiguous Input Streams

Begel, Andrew and Graham, Susan L.
Programming Languages Electronic Notes in Theoretical Computer Science, 110 (0), 75-96. December 2004.

Abstract

Automatically generated lexers and parsers for programming languages have a long history. Although they are well-suited for many languages, many widely-used generators, among them Flex and Bison, fail to handle input stream ambiguities that arise in embedded languages, in legacy languages, and in programming by voice. We have developed Blender, a combined lexer and parser generator that enables designers to describe many classes of embedded languages and to handle ambiguities in spoken input and in legacy languages. We have enhanced the incremental lexing and parsing algorithms in our Harmonia framework to analyze lexical, syntactic and semantic ambiguities. The combination of better language description and enhanced analysis provides a powerful platform on which to build the next generation of language analysis tools.

Spoken Language Support for Software Development

Begel, Andrew
Programming Languages, Human Aspects, Tool Proceedings of the IEEE Symposium on Visual Languages and Human Centric Computing. Rome, Italy. September 2004.

Abstract

Software development environments have changed little since their origins as low-level text editors. Programmers with repetitive strain injuries and other motor disabilities can find these environments difficult or impossible to use due to their emphasis on typing. Our research adapts voice recognition to the software development process, both to mitigate this difficulty and to provide insight into natural forms of high-level interaction. Our contribution is to use program analysis to interpret speech as code, thereby enabling the creation of a program editor that supports voice-based programming. We have created Spoken Java, a variant of Java which is easier to verbalize than its traditional typewritten form, and an associated spoken command language to manipulate code. We are conducting user studies to understand the cognitive effects of spoken programming, as well as to inform the design of the language and editor.

Managing Duplicated Code with Linked Editing

Toomim, Michael and Begel, Andrew and Graham, Susan L.
Human Aspects, Tool Proceedings of IEEE Symposium on Visual Languages and Human Centric Computing. Rome, Italy. September 2004.

Abstract

We present linked editing, a novel, lightweight editor-based technique for managing duplicated source code. Linked editing is implemented in a prototype editor called Codelink. We argue that the use of programming abstractions like functions and macros - the traditional solution to duplicated code - has inherent cognitive costs, leading programmers to chronically copy and paste code instead. Our user study compares functional abstraction with linked editing and shows that linked editing can give the benefits of abstraction with orders of magnitude decrease in programming time.

Programming Revisited: The Educational Value of Computer Programming

Klopfer, Eric and Resnick, Mitchel and Maloney, John and Silverman, Brian and diSessa, Andrea and Begel, Andrew and Hancock, Chris
CS Education, Panel, Programming Languages Proceedings of the 6th International Conference on Learning Sciences. Santa Monica, California. June 2004.

Abstract

This panel will address pedagogical needs for revisiting the role of computer programming for student learning. We will explore advances in programming platforms that enable students to create compelling projects with new technologies, and discuss the affordances of these new initiatives. We will address how these tools and techniques can be integrated into the curriculum of the classroom as well as informal learning environments.

Kinesthetic Learning in the Classroom. Special Session.

Begel, Andrew and Garcia, Daniel D. and Wolfman, Steven A.
CS Education | Proceedings of the 35th SIGCSE Technical Symposium on Computer Science Education. Norfolk, Virginia, USA. March 2004.

Abstract

We propose a special session focusing on kinesthetic learning activities, i.e., physically engaging classroom exercises. These might, for example, involve throwing a frisbee around the classroom to represent transfer of control in a procedure call, or simulating polygon scan conversion with rope for edges and students for pixels. The session will begin with a brief kinesthetic learning activity to motivate the value of these activities. We will follow with a variety of examples, and discuss how to deploy these in a classroom. In the middle of the session, the audience will divide into facilitated groups to design their own activities. Finally, we will all mingle to share and discuss the results. We will set up a public web forum for continued discussion and generation of new ideas.

Eclipse + Harmonia: Language-Based Tools for the Programmer. Poster.

Graham, Susan L. and Begel, Andrew and Boshernitsan, Marat
Programming Languages | Tool | Poster | Proceedings of Workshop on Eclipse Technology Exchange. Anaheim, California. October 2003.

Abstract

Harmonia: An extensible framework for interactive, language-aware programming tools.

StarLogo under the Hood and in the Classroom

Klopfer, Eric and Begel, Andrew
CS Education | Programming Languages | Kybernetes, 32 (1/2), 15-37. January 2003.

Abstract

StarLogo is a computer modeling tool that empowers students to understand the world through the design and creation of complex systems models. StarLogo enables students to program software creatures to interact with one another and their environment, and study the emergent patterns from these interactions. Building an easy-to-understand, yet powerful tool for students required a great deal of thought about the design of the programming language, environment, and its implementation. The salient features are StarLogo's great degree of transparency (the capability to see how a simulation is built), its support to let students create their own models (not just use models built by others), its efficient implementation (supporting simulations with thousands of independently executing creatures on desktop computers), and its flexible and simple user interface (which enables students to interact dynamically with their simulation during model testing and validation). The resulting platform provides a uniquely accessible tool that enables students to become full-fledged practitioners of modeling. In addition, we describe the powerful insights and deep scientific understanding that students have developed through the use of StarLogo.

An Analysis of VI Architecture Primitives in Support of Parallel and Distributed Communication

Begel, Andrew and Buonadonna, Philip and Culler, David E. and Gay, David
Programming Languages | Concurrency and Computation: Practice and Experience, 14 (1), 55-76. January 2002.

Abstract

We present the results of a detailed study of the Virtual Interface (VI) paradigm as a communication foundation for a distributed computing environment. Using Active Messages and the Split-C global memory model, we analyze the inherent costs of using VI primitives to implement these high-level communication abstractions. We demonstrate a minimum mapping cost (i.e. the host processing required to map one abstraction to a lower abstraction) of 5.4 μsec for both Active Messages and Split-C using 4-way 550 MHz Pentium III SMPs and the Myrinet network. We break down this cost to use of individual VI primitives in supporting flow control, buffer management and event processing and identify the completion queue as the source of the highest overhead. Bulk transfer performance plateaus at 44 Mbytes/sec for both implementations due to the addition of fragmentation requirements. Based on this analysis, we present the implications for the VI successor, InfiniBand.

BPF+: Exploiting Global Data-flow Optimization in a Generalized Packet Filter Architecture

Begel, Andrew and McCanne, Steven and Graham, Susan L.
Programming Languages | Tool | Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication. Cambridge, Massachusetts, USA. September 1999.

Abstract

A packet filter is a programmable selection criterion for classifying or selecting packets from a packet stream in a generic, reusable fashion. Previous work on packet filters falls roughly into two categories, namely those efforts that investigate flexible and extensible filter abstractions but sacrifice performance, and those that focus on low-level, optimized filtering representations but sacrifice flexibility. Applications like network monitoring and intrusion detection, however, require both high-level expressiveness and raw performance. In this paper, we propose a fully general packet filter framework that affords both a high degree of flexibility and good performance. In our framework, a packet filter is expressed in a high-level language that is compiled into a highly efficient native implementation. The optimization phase of the compiler uses a flowgraph set relation called edge dominators and the novel application of an optimization technique that we call "redundant predicate elimination," in which we interleave partial redundancy elimination, predicate assertion propagation, and flowgraph edge elimination to carry out the filter predicate optimization. Our resulting packet-filtering framework, which we call BPF+, derives from the BSD packet filter (BPF), and includes a filter program translator, a byte code optimizer, a byte code safety verifier to allow code to migrate across protection boundaries, and a just-in-time assembler to convert byte codes to efficient native code. Despite the high degree of flexibility afforded by our generalized framework, our performance measurements show that our system achieves performance comparable to state-of-the-art packet filter architectures and better than hand-coded filters written in C.

More Flexible Data Types

Spreitzer, Michael and Begel, Andrew
Programming Languages | Proceedings of 8th International Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises. Stanford, California. June 1999.

Abstract

XML can play several roles in a distributed object system. In particular, data can be serialized in XML-based formats. XML-encoded data can be more self-describing than data encoded in many more traditional ways, which facilitates the kind of decentralized protocol evolution seen in Internet-scale development: XML's explicit "tagging and bagging" helps keep extensions straight. However, today's common distributed object systems have type systems that are not flexible enough to describe such data. We suggest a way to make more flexible data types; this improves distributed object systems in general, and is critical to realizing XML's full potential. This approach has: (1) typing judgements based on type structure instead of type identity, (2) extensible record types with optional fields, (3) coarse record types, for which extension is compatible with subtyping, and (4) non-ignorable fields in record values.

Bongo: A Kids' Programming Environment for Creating Video Games on the Web

Begel, Andrew
CS Education | Programming Languages | Dissertation | Cambridge, Massachusetts. June 1997.

Abstract

In recent years, a growing number of researchers and educators have argued that design projects provide rich opportunities for learning. To support this type of learning, educational researchers have developed computational environments (such as Logo and LEGO/Logo) that enable children to design their own animated stories, simulations, and even robotic constructions. The rise of the Internet presents an opportunity for new types of design activities, enabling kids to create projects that reach a larger audience than ever before. Some kids are beginning to create their own home pages on the World Wide Web. With the Java programming language, people can now create increasingly sophisticated Web pages with dynamic, interactive content. But Java is intended for expert programmers, not children. This thesis describes a new programming language and environment called Bongo that brings the power of Java to kids. In particular, it discusses a construction kit written in Bongo that enables kids to build their own video games, and to share those games with others on the Web.

LogoBlocks: A Graphical Programming Language for Interacting with the World

Begel, Andrew
CS Education | Programming Languages | Dissertation | Cambridge, Massachusetts. June 1996.

Abstract

LogoBlocks is a graphical programming language for the Programmable Brick, developed at the Epistemology and Learning Group in the MIT Media Lab. The Programmable Brick is a small handheld computer that a person can attach to a LEGO creation to control motors and read inputs from sensors. LogoBlocks is intended to be an alternative language to BrickLogo, which is a variant of Logo developed for use with the Programmable Brick. Graphical programming has some significant advantages over textual programming, especially in providing visual cues for younger programmers. LogoBlocks attempts to concretize some of these ideas and make the process of building active LEGO creations easier and more intuitive for young children.

Teaching Experience

  • Spring 2025

    Software Engineering for Startups

    CMU

    Co-taught with Fraser Brown

  • Fall 2024

    Celebrating Accessibility

    CMU

    Co-taught with Patrick Carrington

  • Summer 2024

    Preparing Autistic Students for the AI Workforce

    CMU, UVA, Penn State

    Co-taught with Somayeh Asadi, Rick Kubina, Taniya Mishra, Matthew Boyer, Jiwoong Jang, Rory McDaniel, and Aidan San

  • Spring 2023

    Software Engineering for Startups

    CMU

    Co-taught with Michael Hilton

  • Summer 2022

    Educating Autistic Software Engineers Coding Camp

    Clemson University

    Co-taught with Paige Rodeghero

  • Summer 2021

    Computer Game Coding Camp

    Clemson University

    Co-taught with Paige Rodeghero

  • Summer 2020

    Clemson Game Coding Camp

    Clemson University

    Co-taught with Paige Rodeghero

  • Spring 2018

    INFO 461: Cooperative Software Development

    University of Washington, Information School

    Course Instructor

  • Winter 2013

    INFO 461: Cooperative Software Development

    University of Washington, Information School

    Course Instructor

  • Spring 2001

    CS301: Teaching Techniques for Computer Science

    University of California at Berkeley

    Co-taught with Daniel D. Garcia

  • Spring 2000

    CS164: Introduction to Compilers

    University of California at Berkeley

    Graduate Student Instructor. Course taught by Alexander Aiken and George Necula

  • Fall 1997

    CS61a: Structure and Interpretation of Computer Programs

    University of California at Berkeley

    Graduate Student Instructor. Course taught by Brian Harvey

Contact and Meet Me

I would be happy to speak with you about my research, CMU, or any questions you have about your own work.

CMU Office

Please come to my office at TCS 441.

TCS Hall is at 4665 Forbes Ave., Pittsburgh, PA 15213.