NWO-MVI Proposal 020908



Emerging quality
Creating dynamic user and content profiles in online knowledge networks

    Thieme Hennis

    02-Sep-08


Type, theme, and length of project

Classification: 1A
This project can be classified as research treating the ethical and societal aspects of concrete technological developments.

A virtual identity has both psychological and economic significance. In my research, the motivation to share knowledge and help others is an important aspect of the way the virtual identity of an individual is built. It also presumes and supports other economic structures, such as a more flexible employer-employee relationship.

Theme: Virtual reality

The Internet has become an essential element of many economies and societies. Similarly, we are intrinsically attached to and have become part of the Web (Kelly, 2005). In the last few years, we have seen an enormous increase in people being active online: connecting, creating, sharing, and building up identities. Smart data-mining systems are able to create dynamic profiles of people and content representing expertise, relevance, and quality.

Internet pioneer Wendy Hall describes it as follows:

Every time you do something on the internet, it is effectively logged, building up this profile that is with you for your life. We will be able to build software that can interpret that profile to help get the answer that you need in the context that you're in (Smith, 2006).

Length of research: 4 years

The author applies for a single Ph.D. research project (length: 4 years).

Research team

Quality is a socially defined concept. I will try to make it quantifiable by measuring certain use & user relationships within decentralized networks. At present, the research team consists of the following people:

- Professor Wim Veen (TU Delft). Wim has been involved in research into learning and innovation in education for many years. He developed a very relevant and useful model concerning networked learning. Alpha (sociological-educational theories).

- Dr. Jaco Appelman (TU Delft). Jaco has been involved in collaboration software research for many years. He will assist with methodological and content issues. Beta (collaborative software & systems engineering).

- Job Timmermans, M.Sc. (PEERS). Job is co-founder of PEERS and holds master's degrees in philosophy and systems engineering. His role in the project is to discuss the true practical application of the developed system. Alpha (philosophy of quality) & Beta (collaborative software & application interface).

Research description

In decentralized (virtual) networks with tools and technologies that allow anyone to contribute anything, it is increasingly problematic to determine the reliability of content and people online. The research I propose must bring forward rules and variables that can be used (by, for example, software engineers) to let quality and expertise emerge over time and become visible.

What represents quality?

In the last decade, the internet has evolved into a platform allowing any person to participate and contribute. Various easy-to-use web technologies empower people to share interests and knowledge, search and structure content, and connect with friends and peers.


    Web 2.0 represents a blurring of the boundaries between Web users and producers, consumption and

    participation, authority and amateurism, play and work, data and the network, reality and virtuality

    (Zimmer, 2008).

    With the increase of online participation, a number of issues have emerged regarding quality, authority, expertise,

    and trust (Keen, 2007). With organizations becoming more open and seeking ways to make use of the contributions

of people around the world, these issues become even more prevalent (Abbott, 2000). Just as there are many new tools for publishing and creating new content, there are tools specifically made to search, filter, rate, evaluate, and recommend content to people in certain contexts. Still, filtering through more and more resources hidden online or in internal networks remains difficult (Benkler, 2006; Howe, 2006).

    Finding the right resources and people

Search algorithms of popular search engines focus on popularity (or authority) rather than on what is commonly regarded as quality. No human reviews are involved in this process, and results are thus created in a sub-optimal manner (Lewandowski & Höchstötter, 2008). Because similar search algorithms and ineffective content management systems are used within organizations, much of knowledge workers' time is spent recreating already existing (and in-house available) information:

    A lot of money and intellectual power is spent on reinventing the wheel and searching for knowledge. This

    is a huge problem for companies and a central challenge for KM research (Swaak, Ifamova, Kempen, &

    Graner, 2004).

People define quality. Usually this involves relying on others, such as experts or people you trust. This should also be the case for the way search engines and content management systems determine quality. Specifically, this means including human reviews and other metadata generated by people (times used, favorited, tagged, or recommended) in structuring and managing content. In doing so, quality is linked to context, becomes more transparent for the user, and is related to certain context variables (which may be user input variables in search engines).
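As a minimal sketch of this idea (my own illustration, not part of the original proposal), the function below blends the mean human review score with normalized usage metadata into one quality score. The signal names, weights, and squashing constant are all hypothetical assumptions:

```python
# Minimal sketch: combine explicit human reviews with usage metadata
# (times used, favorited, tagged, recommended) into one quality score.
# All weights and the squashing constant are hypothetical choices.

def squash(count, scale=10.0):
    """Map a raw usage count into 0..1 so no single signal dominates."""
    return count / (count + scale)

def quality_score(reviews, used, favorited, tagged, recommended):
    """reviews: list of human review scores in 0..1; the rest are counts."""
    review_score = sum(reviews) / len(reviews) if reviews else 0.0
    usage_score = (squash(used) + squash(favorited)
                   + squash(tagged) + squash(recommended)) / 4.0
    return 0.6 * review_score + 0.4 * usage_score  # assumed 60/40 blend

print(quality_score(reviews=[0.8, 0.9], used=120,
                    favorited=15, tagged=8, recommended=4))
```

In a real system the weights would themselves be context variables, so that what counts as "quality" can differ per user and per search context.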

Standards

A number of initiatives, such as PICS (Resnick & Miller, 1996; Armstrong, 1997) and Resource Profiles (Downes, 2005), propose protocols or frameworks that can be used to evaluate, rate, or structure online content. Many websites have implemented rating and reputation mechanisms to increase transparency and indicate trust in content and people. Still, a general standard for online content does not exist.

Wikipedia co-founder Larry Sanger has recently called for a system for syndicating and rating online data, claiming it to be the obvious next step (and Big Idea) for the Internet. It would enable systems to weight data not just on Google-style PageRank algorithms, but also on things like

quality according to generally trusted sources; or quality according to your peer group; or quality according to academic and academic-endorsed sources; etc. (Sanger, 2008)

What Sanger proposes is a system that includes relevant information about a person along with the rating, in order to add context and enrich the information about pieces of content with relevant metadata (such as quality according to peer group, evaluations, and usage).

As the Web is populated with more data, it becomes easier to automatically mine these kinds of user and usage statistics about people and their behavior online, popularity and interest, friends and activities, and to turn them into valuable metadata. For example, APML (Attention Profile Markup Language) and ULML (User Labor Markup Language) intend to set standards for capturing and sharing information about people online. When you combine this people-metadata with active feedback generated by users (through ratings and evaluations), profiles of people and content can be made automatically (through use) that can be used to increase motivation to contribute and share, enhance flexibility for freelance workers and organizations, and improve efficiency in finding people and content (Choi, Kruk, Grzonkowski, Stankiewicz, Davis, & Breslin, 2006).
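A hedged sketch of what such a combined profile might look like follows; the field names and blend rule are my own illustrative assumptions, not anything defined by APML or ULML:

```python
# Sketch: merge machine-mined attention data (in the spirit of APML)
# with explicit feedback received on contributions into one profile.
# Field names and the blend rule are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Profile:
    user_id: str
    attention: dict = field(default_factory=dict)  # topic -> implicit interest, 0..1
    feedback: dict = field(default_factory=dict)   # topic -> mean explicit rating, 0..1

    def interest(self, topic: str, alpha: float = 0.6) -> float:
        """Blend implicit attention with explicit feedback for one topic."""
        implicit = self.attention.get(topic, 0.0)
        explicit = self.feedback.get(topic, 0.0)
        return alpha * implicit + (1 - alpha) * explicit

p = Profile("thieme", attention={"sustainability": 0.9},
            feedback={"sustainability": 0.7})
print(p.interest("sustainability"))  # 0.6 * 0.9 + 0.4 * 0.7 = 0.82
```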

Hypothesis, research questions, instruments, and methodology

Both the user profile (expertise level & domain) and usage (number of views, clicks, ratings, recommendations, etc.) are relevant and should be utilized to determine the quality and relevance of content. Furthermore, using this information to profile the original contributor leads to a system of dynamic and up-to-date expertise profiles based on the value of contributions. The hypothesis I want to falsify is that in virtual knowledge networks, findable expert and content profiles can be made by analyzing how content is used, and by whom, and linking the results to the original contributor. The two fundamental assumptions that make up the hypothesis are:

1. User and usage information determine the quality and domain of creations; and
2. The quality of creations determines the creator's expertise.

Much has already been written and done regarding these two assumptions. The recent increase in people being active online and sharing content allows for complex data-retrieval and profiling algorithms for dynamically determining quality. How this translates into research is described in the next section.
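To make the assumed feedback loop concrete, here is a toy sketch (my own illustration, not the proposed system): content quality is computed as a rater-expertise-weighted average of ratings (assumption 1), creator expertise follows from the quality of the creator's contributions (assumption 2), and the two are iterated until they stabilize, in the spirit of HITS-style mutual reinforcement. The data, initial values, and damping factor are hypothetical:

```python
# Toy sketch of the hypothesized loop: ratings -> content quality ->
# creator expertise -> (back into) rating weight.

def propagate(ratings, authors, iterations=20, damping=0.5):
    """ratings: {item: [(rater, score in 0..1), ...]}; authors: {item: creator}."""
    users = set(authors.values()) | {r for rs in ratings.values() for r, _ in rs}
    expertise = {u: 0.5 for u in users}  # neutral starting expertise
    quality = {}
    for _ in range(iterations):
        # Assumption 1: quality = expertise-weighted average of ratings.
        for item, rs in ratings.items():
            total = sum(expertise[r] for r, _ in rs) or 1.0
            quality[item] = sum(expertise[r] * s for r, s in rs) / total
        # Assumption 2: expertise drifts toward the mean quality of one's items.
        for user in users:
            own = [quality[i] for i, a in authors.items() if a == user and i in quality]
            if own:
                expertise[user] = (damping * expertise[user]
                                   + (1 - damping) * sum(own) / len(own))
    return quality, expertise

quality, expertise = propagate({"post1": [("ann", 0.9), ("bob", 0.4)]},
                               {"post1": "carol"})
print(quality["post1"], expertise["carol"])  # carol's expertise tracks her post
```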

    Research questions and instruments

    As mentioned, the above assumptions are addressed by various reputation systems and rating, quality, and

    profiling mechanisms. I will first investigate the most important relationships, rules, and (upcoming) standards in

    the generation of metadata about quality, authority, and expertise by these (stand-alone) systems. Concurrently I

will look into the processes of knowledge workers using different types of publishing and rating tools, and find out the most important variables of quality in knowledge networks. These variables are then ordered along different levels: i.e. personal (expertise, competencies, passion, etc.), relational (quality of interactions), and informational (usefulness of content, reliability of the source). This first step of literature research and case studies is inductive, and results in a model that will be validated by conducting a large-scale survey. Through regression analysis, personal biases will be filtered out and an empirical foundation will be created for the interpretatively developed model.
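As a hedged illustration of this bias-filtering step (one possible realization, not the committed design), a fixed-effects regression can separate each rater's personal leniency from the underlying item quality; the survey data below is fabricated purely for the example:

```python
# Sketch: least-squares regression with rater dummies (fixed effects) to
# separate personal rating bias from item quality. Data is fabricated.

import numpy as np

# (rater, item, score) observations, e.g. from the large-scale survey.
obs = [(0, 0, 0.9), (0, 1, 0.8), (1, 0, 0.5), (1, 1, 0.4), (2, 0, 0.7)]
n_raters, n_items = 3, 2

X = np.zeros((len(obs), n_raters + n_items))
y = np.array([score for _, _, score in obs])
for row, (rater, item, _) in enumerate(obs):
    X[row, rater] = 1.0            # dummy for the rater's personal bias
    X[row, n_raters + item] = 1.0  # dummy for the item's quality

# Minimum-norm least squares handles the inherent collinearity here.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("rater leniency:", coef[:n_raters])
print("bias-adjusted item quality:", coef[n_raters:])
```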

Step 1. CONTENT PROFILES: What are the variables and (metadata) standards and initiatives for defining the quality of content? (User-driven: active rating and evaluation. Machine-driven: measuring usage.)
Instrument/Method: literature and desk research.
Outcome: authoritative paper(s) about metadata standards and quality in decentralized online networks, and/or metadata standards, profiling mechanisms, and authority/expertise protocols and rules in decentralized online networks.

Step 2. USER PROFILES: What are the variables and (metadata) standards and initiatives for defining the expertise of persons? (User-driven: recommendations etc. Machine-driven: determination of authority, based on several factors.)
Instrument/Method: idem.
Outcome: idem.

Step 3. A first case study will provide insight into the criteria, possibilities, and constraints of using different tools. How can the standards and variables be measured using existing tools?
Instrument/Method: case study (interview, survey, experiment).
Outcome: criteria, possibilities, and constraints & toolbox.

Step 4. Using the outcomes of the first three steps, I will describe the most important variables and requirements for determining quality and expertise in online networks. How can user-driven and machine-driven metadata about the quality of content translate to dynamic expertise profiles of content creators (or: how should content profiles influence user profiles?) How should the expertise profiles influence content profiles? Additionally, I will clarify the requirements for the case studies and the research that follows, to test the hypothesis. These requirements include instruments/technologies used, user participation, size of the network, and more.

Step 5. VALIDATION MODEL: Does the interrelation of content metadata and user metadata in determining quality and expertise improve the finding of people and resources in organizations? What are the critical success factors?
Instrument/Method: case study (interview, survey).
Outcome: framework for measuring search quality within organizations & critical success factors for the model.

Step 6. Describing the outcomes of the research.
Outcome: report and functional design for the proposed system.

    Timeline

The steps above are ordered chronologically. The timeline below describes the structure in more detail:

1. Year 1 (Steps 1, 2, 3): Literature research, creating the research framework, quality model, and theory, conducting an exploratory case study, preparing further case studies, and writing papers.

2. Year 2 (Steps 4 & 5): Developing and deploying the model in research communities and evaluating the model. More specifically:

   - Describing how different tools are used to create and share information, and how these tools define quality/expertise.

   - Evaluating and refining the model and theory. This means describing (i) how usage (popularity, rating, reviewing, etc.) and users (experts versus laymen) together determine the quality of content, and (ii) how this translates to the expertise or authority of the content creator.

3. Year 3 (Step 5): Similar to the second year, but with more focus on converging research results in order to create an improved and more abstract model for quality and expertise in online knowledge networks. The two main requirements are that the model functions as desired and that it can be used as a basis for creating metadata-generating software.

4. Year 4 (Step 6): Describing and finalizing my research: making it useful for practical solutions.

Methodology: Grounded theory

Because I will develop a new theory about quality based on existing literature and research, the chosen methodology is grounded theory. Grounded theory can be described as a research method in which the theory is developed from the data, rather than the other way around. It is an inductive approach, meaning that it moves from the specific to the more general. Because theories about virtual identities, quality and rating systems, and constitutions around the increased empowerment of people are currently taking shape, this is the best approach for creating a better model.

Societal impact and valorization

My objective is to create a system that measures people's activities and contributions online and automatically translates these into a virtual identity (or karma) that can be found by the right persons in the right context. Such a system allows people to be found and employed more directly and flexibly (Malone & Laubacher, 1998). Depending on how efforts are valued and used by the community, the virtual identity of the contributor changes. I suppose this leads to two things:

- People contribute valuable content to the community (otherwise it will not add value to their ID);

- People are more intrinsically motivated to contribute (fun, community feeling) rather than by financial reward. Still, the virtual ID forms a bridge to future job opportunities or assignments based on (motivation-based) contributions.


    Such a system will change organizational structures, and create a more flexible and free economy, as speculated by

    Pekka Himanen:

Could there be a free market economy in which competition would not be based on controlling information but on other factors, an economy in which competition would be on a different level (and, of course, not just in software, but in other fields, too)?

Pekka Himanen, The Hacker Ethic and the Spirit of the Information Age (2001)

Competition, then, would be based on the contrary: the sharing of information and resources between people and within flexible networks and communities. I realize that this is another testable assumption, but testing it could be done in further research. Before we can do that, though, we must build the foundation of this system.

Case study: Sustainable network

My analysis of quality and expertise in virtual environments (like online communities) will be the basis of the PEERS Interaction Management System [1]: software that analyzes the interaction of users with each other and with online content. All described relationships, rules, and standards will be built into it, so it can be tested and applied immediately. Currently, we are deploying our software at different organizations in different settings. The following will serve as an exploratory case study in the research:

Sustainability network (100-250 professionals) consisting of DKA (De Kleine Aarde), Enviu, OSIRIS, and the TU Delft Sustainability department (SEPAM faculty). These organizations, concerned with sustainability and alternative technologies, have clearly expressed their interest in and commitment to contributing and being part of the proposed research. I will deploy different software tools within these organizations, and use the PEERS Interaction Management System to create dynamic, exchangeable profiles of people and content. These tools allow users to make use of content and connect with people outside of their own organization. Tools and technologies already used by the organizations will be part of the research, provided they allow measurement of use and users by PEERS IMS.

[1] http://aboutpeers.com


Works Cited

Abbott, V. (2000). Web page quality: can we measure it and what do we find? A report of exploratory findings. J Public Health, 22(2), 191-197.

    Armstrong, C. (1997, May 19). Metadata, PICS and Quality. Retrieved August 10, 2008, from Ariadne magazine:

    http://www.ariadne.ac.uk/issue9/pics/

Benkler, Y. (2006). The Wealth of Networks: How Social Production Transforms Markets and Freedom. New Haven, CT: Yale University Press.

    Choi, H. C., Kruk, S. R., Grzonkowski, S., Stankiewicz, K., Davis, B., & Breslin, J. G. (2006). Trust Models for

    Community-Aware Identity Management. Identity, Reference, and the Web Workshop at the WWW Conference,

    2006.

Downes, S. (2005). Resource Profiles. Journal of Interactive Media in Education, 5.

    Himanen, P. (2001). The Hacker Ethic and the Spirit of the Information Age. New York: Random House.

    Howe, J. (2006, June). The Rise of Crowdsourcing. Retrieved August 10, 2008, from Wired Magazine (14):

    http://www.wired.com/wired/archive/14.06/crowds.html

    Keen, A. (2007). The Cult of the Amateur. New York: Doubleday Business.

    Kelly, K. (2005, August). We Are the Web. Retrieved August 08, 2008, from Wired Magazine (13):

    http://www.wired.com/wired/archive/13.08/tech.html

    Lewandowski, D., & Hchsttter, N. (2008). Web Searching: A Quality Measurement Perspective. In A. Spink, & M. (.

    Zimmer, Web Searching: Interdisciplinary Perspectives (pp. 309-343). Dordrecht: Springer.

    Malone, T., & Laubacher, R. (1998, September-October). The dawn of the E-lance economy. Harvard Business

    Review, 144-152.

    Resnick, P., & Miller, J. (1996). PICS: Internet Access Controls Without Censorship. Communications of the ACM, 39,

    87-93.

Sanger, L. (2008, July 8). Syndicated Web ratings - an idea whose time has come? Retrieved August 8, 2008, from Citizendium Blog: http://blog.citizendium.org/2008/07/09/syndicated-web-ratings-an-idea-whose-time-has-come/

Smith, D. (2006, May 21). All set for a baby.com revolution. Retrieved August 10, 2008, from Guardian - The Observer: http://www.guardian.co.uk/technology/2006/may/21/news.theobserver

Swaak, J., Ifamova, L., Kempen, M., & Graner, M. (2004). Finding in-house knowledge: patterns and implications. I-KNOW '04. Graz, Austria: Telematica Institute. Available at https://doc.telin.nl/dscgi/ds.py/Get/File-40767.

    Zimmer, M. (2008). Preface: Critical Perspectives on Web 2.0. First Monday (online), 13 (3).


Preliminary budget

At this stage, I request the full amount needed to complete this research: €300,000 for a full-time (4-year) PhD position, including research team, logistics and travel support, accommodation, and all other expenses.

Valorization workshop

The valorization workshop consists of two parts:

- In November, a modest online conference will be held, using free conferencing and collaboration technologies. I will put four important questions forward, which are addressed by the invited speakers (15 minutes per speaker). A discussion with participants follows, with my research and research question as the main topic.

- An offline meeting will be held in December with all stakeholders, including individuals from PEERS, the research committee, and potential case organizations. Depending on the possibility of having this hosted by an institution, a maximum of €1,000 is needed to hire office space and arrange beverages.

Summary for laymen

Over the last 10 years, the internet has developed technologies that increasingly enable people to create, add, and rate content. Almost anyone with a computer and an internet connection can share his or her passions, interests, and knowledge, and that is exactly what is happening. Various mechanisms exist to filter and categorize this abundance of content, but it remains very difficult to separate the wheat from the chaff online or in virtual networks. This applies both to content (what is reliable/of high quality?) and to people (is this person really an expert in this field?).

The increase in people's online activities creates not only more content, but also better possibilities to structure and rate this content. This can be done in different ways:

- First, the use of content can be measured: this covers both the passive reading and the active structuring/rating/evaluating of content;
- Second, it can be measured by whom the content is used.

The hypothesis is that by measuring and analyzing both the usage and the user, very specific and up-to-date profiles of content and people can be created. These profiles are dynamic and depend on the activities surrounding a person or piece of content. The more active someone is, the richer (though not necessarily better) his or her profile becomes, and the more a piece of content is used, the better it can be profiled. Such a system supports the decentralized and flexible working of knowledge workers in virtual or open organizations.