Crowdsourcing Documentation in Software Engineering

Presented at the ICSE 2014 Workshop on Crowdsourcing in Software Engineering, June 2, 2014, Hyderabad, India.

Margaret-Anne (Peggy) Storey

ICSE 2014 1st International Workshop on Crowdsourcing in Software Engineering

Acknowledgements

Christoph Treude, Brendan Cleary, Fernando Figueira Filho, Jamie Starke, Gargi Bougie, Peter Rigby, Lars Grammel, Leif Singer, Laura MacLeod, Daniel German, Alexey Zagalsky

Chris Parnin, Georgia Tech; Ohad Barzilay, Tel-Aviv University, Israel; Arie van Deursen, TU Delft, the Netherlands; Li-Te Cheng, IBM Research; Ian Bull, Eclipsesource

“Documentation is the castor oil of software development”

Gerald Weinberg, The Psychology of Computer Programming, 1971

Documentation to capture…

Requirements
Architecture
Features, implementation
Scenarios of use
Examples of use
Testing
Decisions
And more?

Created by…

Developers, contributors

Documenters

Automatically generated

Users

The crowd!

Designed for…

End users
Client developers
Contributors

Documentation rationale…

To replace communication

To specify a contract with partners

To provide organizational memory

To reflect

To seek feedback

For the public good! [Wasko et al.]

Documentation formats…

Formal documentation (hierarchically structured)

Technical articles
Books
Self-documenting code
Source code comments
Forums
Email lists
Usenet
Issues, bug tracking
Archived chats
Wikis
Blog posts, microblogs
Tagging
Stackoverflow
Videos, podcasts
Community portals (aggregate channels)

Documentation challenges…

Navigability, discoverability
Audience and “fit for purpose”
Boring prose
Consistent use of terminology
Staying current
Costly, slow
Explicit versus tacit knowledge
Lack of good examples

Crowdsourcing…

“…obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers… the work comes from an undefined public rather than being commissioned from a specific, named group…

Explicit crowdsourcing lets users work together to evaluate, share and build different specific tasks, while implicit crowdsourcing means that users solve a problem as a side effect of something else they are doing.” [Wikipedia, June 1, 2014]

Community versus crowd contributions?

Individual or team contributions (e.g. design documents, podcasts)

Community contributions: created by a few (e.g. translation efforts)

Crowdsourcing contributions: many small contributions that add value (e.g. views, likes, comments, tags, votes)

Social production [Yochai Benkler]

Industrial revolution: high costs to access broadcast media

Low-cost, distributed small contributions at scale

Not just turning levers but adding wisdom, creativity

Not a fad! Critical long term shift caused by the internet

Social media as a disruptive force: an enabler for crowdsourcing

Enhancing the participatory culture in software development and in software documentation

Storey, M.-A., L. Singer, F. Figueira Filho, B. Cleary and A. Zagalsky, The (R)evolutionary Role of Social Media in Software Engineering, ICSE 2014 (Future of Software Engineering Track), Hyderabad, 2014.

Social Media Channels for Software Documentation

Community Portal

Tagging

Microblogging

Question & Answer Websites

Videos, podcasts

Blogging

Wikis

Outline of the rest of this talk

Some insights on how social media channels can support “crowdsourced” documentation in software development

Discussion

Wikis

Wikis for documenting Software

Wikis and software documentation

Used extensively (requirements, design, planning), integrated with many tools

Some shortcomings: lack of authoritativeness [Dagenais and Robillard FSE 2010]

Designed by Ward Cunningham in 1994

Social Tagging

How does tagging help with crowdsourced software documentation?

TagSEA: Tagging Waypoints in source code and gathering into Tours

M.-A. Storey, J. Ryall, J. Singer, D. Myers, L.-T. Cheng, M. Muller. How Software Developers Use Tagging to Support Reminding and Refinding. IEEE Transactions on Software Engineering (TSE), 2009.

Tagging in

Studied introduction and adoption of tags by several teams for work items

C. Treude and M.-A. Storey. Work Item Tagging: Communicating Concerns in Collaborative Software Development. In IEEE Transactions on Software Engineering 38, 1 (January/February 2012). pp. 19-34.

Tagging in

Findings:
– Categorization (cross-cutting concerns, see also Martin Robillard’s FEAT tool)
– Organization
– Finding and refinding

ConcernLines

Treude, C., and M.-A. Storey, ConcernLines: A timeline view of co-occurring concerns, formal research demonstration, IEEE ICSE ’09.
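To make the idea of "co-occurring concerns over time" concrete, here is a minimal Python sketch that counts how often pairs of tags appear together on work items in each time period. It is not the ConcernLines implementation: the sample items, the month-string periods, and the function name `cooccurrence_by_period` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tagged work items: (period, set of tags). Not real project data.
items = [
    ("2009-01", {"performance", "ui"}),
    ("2009-01", {"performance", "ui", "testing"}),
    ("2009-02", {"testing", "ui"}),
]


def cooccurrence_by_period(items):
    """Count, per period, how often each pair of tags appears on the same item."""
    counts = {}
    for period, tags in items:
        bucket = counts.setdefault(period, Counter())
        # Sort tags so each pair is always stored in the same order.
        for pair in combinations(sorted(tags), 2):
            bucket[pair] += 1
    return counts


timeline = cooccurrence_by_period(items)
for period in sorted(timeline):
    print(period, dict(timeline[period]))
```

Plotting these per-period counts along a time axis would give a rough timeline of which concerns tend to be raised together.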

Microblogging

Why do developers tweet?

Microblogging

Software engineers tweet actively (share) facts about software engineering topics and technology

G. Bougie, J. Starke, M.-A. Storey and D. German. Towards Understanding Twitter Use in Software Engineering: Preliminary Findings, Ongoing Challenges and Future Questions. In Proceedings of the 2nd International Workshop on Web 2.0 for Software Engineering. 2011.

Survey/Interviews/Survey

Findings:
– Awareness
– Learning
– Relationships

“It was evolving way faster than I was able to keep up with it. And the only way to keep up was to follow some Node.js people on Twitter.”

Leif Singer, Fernando Figueira Filho, Margaret-Anne Storey. Software Engineering at the Speed of Light: How Developers Stay Current Using Twitter. ICSE 2014.

Blogging

Why do developers blog?

Blogging

Determining requirements through blogs [Park and Maurer, CHASE 2009]

How developers blog: high-level concept discussion and requirements [Pagano and Maalej, MSR 2011]

Blogs play a role in documenting APIs [Treude and Parnin, Web2SE 2011]

Is there potential to increase the size of the Blogging crowd for software documentation?

Question and Answer Websites

What role do Question and Answer websites play in documentation?

Over 92% of the questions on Stackoverflow are answered, and for those 92% the median answer time is 11 minutes

L. Mamykina, B. Manoim, M. Mittal, G. Hripcsak, and B. Hartmann. Design lessons from the fastest q&a site in the west. CHI 2011.
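As an illustration of how figures of this kind can be derived, the following Python sketch computes the fraction of answered questions and the median time to an answer from timestamp pairs. The sample data is invented, the function name `answer_stats` is hypothetical, and measuring "answer time" as the delay to the first answer is an assumption made here, not a detail taken from the slide.

```python
from datetime import datetime
from statistics import median

# Hypothetical sample: (time asked, time of first answer, or None if unanswered).
questions = [
    (datetime(2014, 6, 1, 9, 0), datetime(2014, 6, 1, 9, 5)),
    (datetime(2014, 6, 1, 9, 10), datetime(2014, 6, 1, 9, 21)),
    (datetime(2014, 6, 1, 10, 0), None),
    (datetime(2014, 6, 1, 11, 0), datetime(2014, 6, 1, 11, 40)),
]


def answer_stats(questions):
    """Return (fraction answered, median minutes to first answer)."""
    if not questions:
        return 0.0, None
    # Delay in minutes, computed only for questions that received an answer.
    delays = [
        (answered - asked).total_seconds() / 60
        for asked, answered in questions
        if answered is not None
    ]
    fraction = len(delays) / len(questions)
    median_minutes = median(delays) if delays else None
    return fraction, median_minutes


fraction, minutes = answer_stats(questions)
print(f"{fraction:.0%} answered, median {minutes:.0f} minutes")
```

Note that the median is taken over answered questions only, which matches the slide's phrasing "for those 92%".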

Stackoverflow

How-to questions prevalent, and used frequently by novices

C. Treude, O. Barzilay and M.-A. Storey. How do Programmers Ask and Answer Questions on the Web? NIER/ICSE 2011.

Linking Stackoverflow data with API usage

C. Parnin, C. Treude, L. Grammel and M.-A. Storey. Crowd Documentation: Exploring the Coverage and the Dynamics of API Discussions on Stack Overflow. Under submission; blogged (50,000 hits) at http://blog.ninlabs.com/2012/05/crowd-documentation/, May 2012.

Stackoverflow as Crowd Documentation

Coverage of API documentation: 77% of the Java API classes & 87% of Android API classes
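A coverage figure like this can be read as "the share of API classes that are mentioned in at least one discussion". The Python sketch below shows that calculation under a deliberately naive assumption: a class counts as covered if its simple name appears as a whole word in a post. The class list, the posts, and the function name `api_coverage` are hypothetical, and the actual study's linking method may well differ.

```python
import re


def api_coverage(api_classes, posts):
    """Return (fraction of classes mentioned in any post, set of mentioned classes)."""
    known = set(api_classes)
    if not known:
        return 0.0, set()
    mentioned = set()
    for text in posts:
        # Naive linking: split the post into identifier-like words and
        # keep those that exactly match a known class name.
        words = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))
        mentioned |= words & known
    return len(mentioned) / len(known), mentioned


# Hypothetical inputs for illustration only.
api_classes = ["ArrayList", "HashMap", "StringBuilder", "BitSet"]
posts = [
    "How do I sort an ArrayList of strings?",
    "HashMap vs Hashtable: which one is thread safe?",
    "Is ArrayList faster than LinkedList?",
]

coverage, mentioned = api_coverage(api_classes, posts)
print(f"coverage: {coverage:.0%}, mentioned: {sorted(mentioned)}")
```

Whole-word matching on simple names will over-count classes with common English names and under-count mentions inside code that the tokenizer splits differently, so a real analysis would need more careful linking.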

Speed of coverage:

Impact on documentation tools?

Automatically generating documentation

Visualizing crowd documentation

http://latest-print.crowd-documentation.appspot.com/?api=android

How do Developers use YouTube to Share Knowledge?

Videos, podcasts

Developer motivations?

Documentation! But also …

Reputation: Improves their online persona

Dedication to helping others: “What I wish I had known when I started”

Efficiency: “Throw it up on the internet and forget about it”

http://lmacleod.com/

Implications

Many projects use videos to support documentation and onboarding (e.g. MSDN) so…

How can they be improved for the recipient?
How effective are videos at sharing tacit knowledge?
Tool enhancements? Integration with IDE? [e.g. Tours]

Cheng, L.-T., M. Desmond and M.-A. Storey, “Presentations by Programmers for Programmers”, ICSE 2007, IEEE 29th International Conference on Software Engineering.

Is this crowdsourcing? Are code walkthroughs on YouTube effective?

How much do the social features matter?

A social platform for crowd input for video documentation?

Community portals

Stores code and project resources
Provides version control
Hosts web pages
Connects people
Links to communication tools
Records interactions

C. Treude and M.-A. Storey. Effective Communication of Software Development Knowledge Through Community Portals. ESEC/FSE ’11.

Implications of different media

Content on wikis is often stale, but useful for posting information quickly

Blog posts create more buzz or fanfare

Official product documentation is trusted (review it carefully or rely on the crowd?)

Have an updating process (or crowdsource it?)

Have mechanisms to solicit feedback (e.g. commenting, blog posts, voting)

Social Media Channels to support Software Documentation

Community Portal

Tagging

Microblogging

Question & Answer Websites

Videos, podcasts

Blogging

Wikis

Discussion

Documentation challenges revisited

Recommenders to aid in discoverability
Keeping up: leverage the crowd
Incentive: participatory culture
Video and podcasts for tacit knowledge
Mining of social media can point to code examples (implicit mechanism)

Discussion points

When does a community become a crowd?
Gaps and nichification?
Incentives? Dynamics?
Study other portals, hubs?
Do these mechanisms translate to industry?

What do you see as challenges, opportunities for involving the crowd?

http://www.thechiselgroup.org http://margaretannestorey.wordpress.com/

@thechiselgroup, @margaretstorey mstorey@uvic.ca

Funded by NSERC/DRDC/IBM
