MY WEB INTELLIGENCE: Digital Humanities research issues

AN OPEN SOURCE PLATFORM AS FOUNDATION FOR DH TOOLS

An open source platform to bind them all

Digital Humanities are a major challenge for the social sciences: they bring information technologies into extraction, archiving, automated analysis, corpus qualification, data visualization…

CREATE A UNIFYING DYNAMIC

Too many one-shot projects and high-value innovations that never consolidate experience. One platform to bind them all?

OPEN GOVERNANCE FROM DAY ONE

My Web Intelligence is built around collaborative tools (GitHub, Trello, etc.). These have been public from day one, making all research progress public.

FOR THE COMMON GOOD

My Web Intelligence is meant to be as widely shared as possible, so that intelligence tools benefit everyone (easy to install, well documented, etc.).

COLLABORATIVE FIRST

My Web Intelligence chooses openness and collaboration to answer the challenges posed by new technologies and media.

The content manager: the challenge of managing heterogeneous archives

Enabling the humanities and social sciences (SHS) to study the digital humanities means, first of all, offering a platform able to extract and retain huge amounts of expressions from heterogeneous sources.

MASTERING THE EXTRACTION AND ARCHIVING AGENTS (CRAWLERS) AMID BIG DATA.

AUTOMATICALLY EXTRACTING CORPORA ON DEMAND

Provide a crawler that can access heterogeneous sources, with enough modularity to meet the needs of every user's project.

GIVE A USER INTERFACE TO MANAGE THE CORPUS

Cleaning, deleting, sorting and rearranging the corpus according to one's own heuristics is a must for any DH project.

COLLABORATIVE MANAGEMENT TOOLS FOR DATA STUDIES

No one wins the DH challenge alone. A platform of this ambition will integrate a team-management module into the data-processing service.

RECRUITING INTELLIGENT AGENTS

The democratization of machine learning and artificial intelligence now makes it possible to enlist processing algorithms to assist in the mass management of your data.

Content analysis: the challenge of automating qualification

Natural language processing has made enormous progress. However, only some open solutions provide opportunities to qualify massive corpora. Our project aims to bring together the foundations of research in this area.

AUTOMATICALLY QUALIFY DATA ABOUT COMMUNICATION SITUATIONS.

QUALIFY THE COMMUNICATION SITUATION

Each expression has to be contextualized within a mediated communication situation and needs to be qualified automatically.

ANALYZE THE IMPACT OF DISCURSIVE ACTS

Record impact indicators for every expression, in order to measure not only its influence but also its resonance with the representations held by the receivers of the message.

ANALYZE THE CONTENT AUTOMATICALLY

Lemmatization of texts, the main objects of expressions, argument trees... Content analysis enables automatic classification of the corpus, which serves the detection of collective representations.

ANALYZE STYLISTIC FORMS TO IDENTIFY SPEAKER PATTERNS

Style, sentiment, language register, type of vocabulary... Detecting styles enriches speaker patterns and helps to better identify their communicative intent.

The algorithms of speech: at the source of positions

The generation of discourse follows more or less stereotyped behaviors. Algorithms that detect these patterns are used to measure them, but also to predict them...

DETECTING AND MEASURING PATTERNS AT THE SOURCE OF SPEECH TO UNDERSTAND ITS GENERATIVE ECONOMY.

ANALYZE THE POSITIONS OF SPEAKERS

By qualifying expressions according to the discursive act model, it becomes possible to quantify the production of discourse through multivariate statistical processing (e.g. correspondence analysis (AFC), principal component analysis (ACP), trees...).

PREDICTING THE PRODUCTION OF SPEECH

Predictive algorithms make it possible not only to qualify incomplete data but also to generate hypotheses about future positions by developing future scenarios.

SOCIAL NETWORK ANALYSIS AS THE SOCIAL CONTEXT OF SPEECH

Structural network analysis, applied to discourses through their co-citations, recovers the frame that binds and socializes enunciators.

SNA AS THE ANALYSIS OF THE COGNITIVE STRUCTURES OF SPEECH

SNA provides a new perspective on the analysis of argumentative co-presence in large corpora by introducing its own notions (centrality, betweenness, etc.).
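To make the notion of centrality concrete, here is a minimal sketch that computes normalized degree centrality over a small co-citation graph. The speaker names and links are invented for illustration, and `degree_centrality` is a hypothetical helper, not a function of My Web Intelligence. Betweenness would need a shortest-path algorithm and is not shown.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalized degree centrality for an undirected graph.

    `edges` is an iterable of (node_a, node_b) pairs; each node's score is
    its number of distinct neighbors divided by (number of nodes - 1).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-citations
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical co-citation links between speakers (illustrative data only).
co_citations = [
    ("blog_a", "journal_b"),
    ("blog_a", "forum_c"),
    ("blog_a", "site_d"),
    ("journal_b", "forum_c"),
]
centrality = degree_centrality(co_citations)
```

In this toy graph the speaker cited alongside every other one gets the highest score, which is the kind of signal a co-citation map can surface.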

Data visualization: the look as a source of intelligence?

The data visualization challenge is to provide interpretive schemes for large masses of data in a specific study context. My Web Intelligence explores the relationship between visualization and digital expression.

VIEWING AND INTERPRETING DIGITAL EXPRESSIONS ON THE WEB.

NAVIGATING THE CORPUS OF EXPRESSIONS

View and navigate the expressions through dashboards of acts of enunciation (type, media, speakers, audience, etc.).

SORTING AND INDEXING CONTENT

Explore the corpus through keyword clouds, dynamic indexes and other representations of the text to facilitate conceptual analysis.

MAPPING THE SOURCES OF INFORMATION

Mapping the speakers enables contextual navigation across media outlets by analyzing their relevant relationships as the social context of utterance.

MAPPING COLLECTIVE REPRESENTATIONS

The use of SNA in concept mapping offers the prospect of a new visualization of collective representations, and therefore of the context of knowledge and the episteme of the utterances studied.

MY WEB INTELLIGENCE

Architecture Patterns Issues

- PROJECT MANAGER: territories and requests
- ORACLES: first list of approved expressions for starting the graph
- READER: download and index the document as an expression
- CRAWLER: deep crawling of the web
- SCRAPER: read heterogeneous files to build an expression
- APPROBATION: algorithm for approving linked expressions
- QUALIFICATION: enrich expressions and domains with data
- RANKING: build KPIs to rank expressions and domains
- APIs: build a bridge to third-party software
- EXPORT FILES: CSV, GEXF and all kinds of models
- VISUALIZATION: use visualization libraries to navigate this data (graph, tree, etc.)

Input: absorb the corpus

Annotate: qualify your data

Output: show the patterns

The My Web Intelligence challenge is to absorb heterogeneous corpora, then archive and index them in an Author-Media-Expression data model.

The aim is both to respect the specificity of each medium and to use meta-analysis models of communication to analyze not only the meaning of speech but also the pragmatics of communicative acts.

It is not only about what is said, in a kind of naive sociology, but also about understanding the social dynamics and strategies at work in the production of meaning.
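One possible reading of the Author-Media-Expression model can be sketched with plain data classes. All field names below (`url`, `domain`, `kind`, `links`, `annotations`) are assumptions made for illustration; the source only names the three entities.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    name: str


@dataclass
class Media:
    domain: str        # assumed: the site or outlet hosting the expression
    kind: str = "web"  # assumed: a free-form media type label


@dataclass
class Expression:
    url: str
    text: str
    author: Author
    media: Media
    links: list = field(default_factory=list)        # outgoing URLs, if any
    annotations: dict = field(default_factory=dict)  # enrichment added later


# Illustrative record; the URL and names are placeholders.
expr = Expression(
    url="https://example.org/post/1",
    text="An opinion about digital humanities.",
    author=Author(name="Jane Doe"),
    media=Media(domain="example.org", kind="blog"),
)
expr.annotations["language"] = "en"
```

Keeping the author and the medium as separate objects lets the same expression be analyzed both for what is said and for who says it and where.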

The second part of the development of My Web Intelligence is to build intelligent agents able to play the role of librarians in their functions:

- Analysis of the relevance of a document with respect to the project. One of the major issues in data management is cleaning out noise and digital debris. Beyond that, relevance to the research questions is key to the usability of DH platforms.

- Data enrichment (or annotation). The second role of the "librarian function" in each research project is to work on and annotate the data, through algorithms as well as human agents, using both external and internal sources.
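The relevance function of such a librarian agent could be sketched, very roughly, as a keyword filter. This is a deliberately naive stand-in, not the project's approbation algorithm; `relevance_score`, `filter_relevant`, the keyword list and the 0.5 threshold are all hypothetical.

```python
import re


def relevance_score(text, keywords):
    """Fraction of project keywords that occur in the text (0.0 to 1.0)."""
    if not keywords:
        return 0.0
    words = set(re.findall(r"\w+", text.lower()))
    hits = sum(1 for kw in keywords if kw.lower() in words)
    return hits / len(keywords)


def filter_relevant(documents, keywords, threshold=0.5):
    """Keep only the documents whose score reaches the threshold."""
    return [doc for doc in documents if relevance_score(doc, keywords) >= threshold]


# Illustrative project vocabulary and documents.
project_keywords = ["archive", "corpus", "crawler"]
documents = [
    "The crawler stores each page in the corpus archive.",
    "Buy cheap watches online today!",
]
kept = filter_relevant(documents, project_keywords)
```

A real agent would likely combine such a score with link structure and human validation, but even this simple filter shows how noise can be separated from documents that match the research questions.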

My Web Intelligence is conceived as a core framework in the ecosystem of data analysis projects. A major issue is interconnecting the process with third-party applications, for both inputs and outputs.

Design an input/output API to make it compatible with third-party data management and processing solutions.

Facilitate the production and export of database-compatible files for processing in third-party software (e.g. Gephi, SPSS, R, etc.).
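As a minimal sketch of such an export, the following writes a list of links between domains as CSV text using only the standard library. The `Source`, `Target` and `Weight` column names are an assumption based on the edge-list convention commonly used by graph tools; `export_edges_csv` and the sample data are hypothetical.

```python
import csv
import io


def export_edges_csv(edges):
    """Serialize (source, target, weight) rows as CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for source, target, weight in edges:
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Illustrative links between domains, with a citation count as weight.
edges = [("example.org", "example.net", 3), ("example.net", "example.com", 1)]
csv_text = export_edges_csv(edges)
```

The resulting text can be written to a file and loaded into a spreadsheet, a statistics package or a graph tool, adjusting the column names to what the target software expects.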

Use data visualization solutions to navigate, process and analyze large corpora and identify meaningful patterns.