Gephi Tutorial: How to Import Data into Gephi


DESCRIPTION

A tutorial for Gephi: how to Import Data


  • Gephi: focus on data import

    V 1.2 Sept 2014

    Clément Levallois, Gephi Support Team and Assist. Prof. @EMLYON

    1

  • Clement Levallois 2

    Bio notes

    Education in economics, management, history of science (Ph.D.). Turned to digital methods for research:

    data visualization, network analysis, natural language processing, web applications and more.

    Member of the Gephi Community Support team. Gephi certified trainer: https://marketplace.gephi.org/service/data-analysis/

    Contact and feedback welcome: on Twitter @seinecle or at www.clementlevallois.net

  • Clement Levallois 3

    1. Network already formed

  • Clement Levallois 4

    Network already formed? Import is direct

  • Clement Levallois 5

    Network data in Excel format

    - Via the data laboratory

    - Needs at least a list of edges

    - If you don't have a list of nodes, they will be created automatically from the nodes mentioned in the edge list: just tick the option for node creation when prompted in the edge import wizard!
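As an illustration of the "list of edges" mentioned above, here is a minimal Python sketch that writes an edge list as CSV text with the headers Source,Target. The function name `write_edge_list` and the sample names are hypothetical; the only thing taken from the slides is the header pair.

```python
import csv
import io

def write_edge_list(pairs, out):
    """Write (source, target) pairs as CSV rows under the headers Source,Target."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    for source, target in pairs:
        writer.writerow([source, target])

# Hypothetical sample data: no separate node list is provided,
# so the nodes would have to be created from the edges at import time.
pairs = [("Alice", "Bob"), ("Bob", "Carol")]
buffer = io.StringIO()
write_edge_list(pairs, buffer)
edge_csv = buffer.getvalue()
print(edge_csv)
```

To produce a file instead of a string, the same function can be given a file opened with `open("edges.csv", "w", newline="")`.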

  • Clement Levallois 6

    Frequent mistakes when importing data in the data lab:

    - Take extra care with the headers (first line) of your files! At least:
        - For edges: Source,Target
        - For nodes: Id,Label

    - Take extra care with the format of fields!
        - Check that your numeric attributes (as opposed to textual ones) are imported as numbers (Float, Double or Integer), not as String (which means text)

    - If you import a list of nodes and a list of edges:
        - Import the list of nodes first!

    - If you import a list of edges only:
        - Check that the option for the creation of missing nodes is selected
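Some of these mistakes can be caught before opening Gephi. The sketch below checks a CSV table for the required headers and for non-numeric values in columns that are meant to be numbers. The function `check_table`, the column name `Weight` and the sample tables are assumptions made for the example, not part of Gephi.

```python
import csv
import io

# Minimal headers named in the slides.
REQUIRED_HEADERS = {"edges": ["Source", "Target"], "nodes": ["Id", "Label"]}

def check_table(text, kind, numeric_columns=()):
    """Return a list of human-readable problems found in a CSV table."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    headers = reader.fieldnames or []
    for name in REQUIRED_HEADERS[kind]:
        if name not in headers:
            problems.append(f"missing header: {name}")
    # Data rows start on line 2 because line 1 holds the headers.
    for line_number, row in enumerate(reader, start=2):
        for column in numeric_columns:
            value = row.get(column)
            if value is None:
                continue  # column absent from this row
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_number}: {column}={value!r} is not a number")
    return problems

# Hypothetical sample tables.
good = "Source,Target,Weight\nA,B,1.5\nB,C,2\n"
bad = "From,To,Weight\nA,B,heavy\n"
print(check_table(good, "edges", ["Weight"]))
print(check_table(bad, "edges", ["Weight"]))
```

A clean table yields an empty list. This check only looks at the text of the file: the column type still has to be chosen correctly in the import wizard.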

  • Clement Levallois 7

    2. Data not in network format yet

  • Clement Levallois 8

    The general idea

  • Clement Levallois 9

    The steps

    You start with: a list / table of observations

    1. Extract connected entities (a network!): not many solutions!

    2. Format this network in a conventional standard: many solutions

    3. Display the network with a software package / in the browser: many solutions
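The first two steps can be sketched in a few lines of Python. This is only one possible reading, under stated assumptions: the observations are hypothetical (author, journal) rows, each row becomes one edge, and GEXF 1.2 is picked as the "conventional standard". The helper names `extract_edges` and `to_gexf` are invented for the example.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def extract_edges(observations):
    """Step 1: turn each (author, journal) observation into one edge, dropping repeats."""
    edges = []
    for row in observations:
        edge = (row["author"], row["journal"])
        if edge not in edges:
            edges.append(edge)
    return edges

def to_gexf(edges):
    """Step 2: serialize the edges as a minimal static, undirected GEXF 1.2 document."""
    gexf = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(gexf, "graph", {"mode": "static", "defaultedgetype": "undirected"})
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")
    # Nodes are simply every name that appears in at least one edge.
    for name in sorted({name for edge in edges for name in edge}):
        ET.SubElement(nodes_el, "node", {"id": name, "label": name})
    for index, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge", {"id": str(index), "source": source, "target": target})
    return ET.tostring(gexf, encoding="unicode")

# Hypothetical table of observations.
observations = [
    {"author": "Ann", "journal": "Journal X"},
    {"author": "Bob", "journal": "Journal X"},
    {"author": "Ann", "journal": "Journal X"},
]
edges = extract_edges(observations)
gexf_text = to_gexf(edges)
print(gexf_text)
```

Step 3 is then a matter of saving `gexf_text` to a `.gexf` file and opening it in a viewer such as Gephi.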

  • Clement Levallois 10

    Gaze: similarity computer

    Network connecting universities, with ties that are weak or strong depending on how similar the universities are in terms of the journals where they publish

    Gaze can be downloaded from www.clementlevallois.net

    Available soon as a Gephi plugin (easier to use, works on Macs). Contact me at [email protected] if you need it now.
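To make the idea of similarity-based ties concrete, here is a small Python sketch. It is not Gaze's code: the slides do not say which measure Gaze uses, so cosine similarity is an assumption chosen for illustration, and the university and journal names are made up.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two {journal: publication count} profiles."""
    shared = set(a) & set(b)
    dot = sum(a[key] * b[key] for key in shared)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similarity_edges(profiles, threshold=0.0):
    """Return (left, right, weight) for every pair whose similarity exceeds the threshold."""
    edges = []
    for left, right in combinations(sorted(profiles), 2):
        weight = cosine(profiles[left], profiles[right])
        if weight > threshold:
            edges.append((left, right, weight))
    return edges

# Hypothetical publication counts per journal.
profiles = {
    "Univ A": {"Journal 1": 2, "Journal 2": 1},
    "Univ B": {"Journal 1": 2, "Journal 2": 1},
    "Univ C": {"Journal 3": 4},
}
similar = similarity_edges(profiles)
print(similar)
```

Universities that publish in the same journals in the same proportions get a weight close to 1 (a strong tie); universities with no journal in common get no edge at all.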

  • Clement Levallois 11

    Creating a network from co-occurrence data in Excel / CSV

    Network made of authors and literary agents

    Network of article co-authors

    Network formed by co-authors and the journals where they publish

    A plugin to install directly from Gephi (Tools -> Plugins)

    See next slides for further instructions on how to use this plugin
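For readers who want to see what "co-occurrence" means in practice, the sketch below builds a weighted co-author network from CSV rows, assuming each row lists the authors of one article. It does not reproduce the plugin's behavior; `cooccurrence_edges` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter
from itertools import combinations

def cooccurrence_edges(text, delimiter=","):
    """Count how many rows each pair of entities appears in together."""
    weights = Counter()
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Deduplicate and sort so that (Ann, Bob) and (Bob, Ann) count as one pair.
        entities = sorted({cell.strip() for cell in row if cell.strip()})
        for pair in combinations(entities, 2):
            weights[pair] += 1
    return weights

# Hypothetical data: one article per row, one author per cell.
articles = "Ann,Bob,Cy\nAnn,Bob\nCy\n"
weights = cooccurrence_edges(articles)
for (source, target), weight in sorted(weights.items()):
    print(source, target, weight)
```

Each pair and its count can then be written out as Source, Target and Weight columns of an edge list.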

  • Clement Levallois 12

    1. Install the plugin

    1. Tools -> Plugins

    2. Select "Convert Excel and csv files" in the tab "Available plugins", then install it and restart Gephi

  • Clement Levallois 13

    2. Select "Import Spigot", then "Data importer" and follow the wizard

  • Clement Levallois 14

    Example of a dataset

    This Excel file can be converted into a network where:
    - Followers are connected to followers
    - Followers are connected to persons being followed
    - Time periods are taken into account

    Note on proper formats for the Date column:
    - Can be an interval of dates, with the start date and end date separated by a comma (like above)
    - Can be an interval of years, also separated by a comma (2012,2014)
    - Can be a simple date (2012-12-12) or simple year (1979)
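The accepted Date formats can be checked with a short Python function before running the import. This is a sketch under assumptions: dates are taken to be in ISO form (YYYY-MM-DD) as in the example 2012-12-12, and a bare year is read as the whole year, which is a choice made here and not something the slides specify.

```python
from datetime import date

def parse_point(text, is_end=False):
    """Parse one date (YYYY-MM-DD) or one year (YYYY)."""
    text = text.strip()
    if len(text) == 4 and text.isdigit():
        year = int(text)
        # Assumption: a bare year covers January 1st to December 31st.
        return date(year, 12, 31) if is_end else date(year, 1, 1)
    return date.fromisoformat(text)

def parse_date_cell(cell):
    """Return (start, end) for a simple date, a simple year, or a comma-separated interval."""
    parts = cell.split(",")
    if len(parts) == 1:
        return parse_point(parts[0]), parse_point(parts[0], is_end=True)
    if len(parts) == 2:
        start = parse_point(parts[0])
        end = parse_point(parts[1], is_end=True)
        if start > end:
            raise ValueError(f"start is after end in {cell!r}")
        return start, end
    raise ValueError(f"expected one value or 'start,end', got {cell!r}")

print(parse_date_cell("2012,2014"))
print(parse_date_cell("2012-12-12"))
print(parse_date_cell("1979"))
```

A cell that raises `ValueError` here is worth fixing in the spreadsheet before conversion.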

  • Clement Levallois 15

    Create a dynamic network via the data laboratory

    You need two attributes for the nodes:
    - An attribute containing the date when the node appears
    - Another one containing the date when the node disappears

    Use the "Merge columns" function to create a time interval from these 2 attributes

    (see next slide)

    Same principle to create dynamic edges.

    This approach has severe limits: only the topology of the network is dynamic, not the attributes (which stay fixed). And the dynamic topology is very limited: a node can just appear and disappear, that's all.

    This is a case where a programmatic approach would bring much more flexibility.
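As a rough idea of what a programmatic approach allows, the Python sketch below stores several presence periods per node, each with its own attribute value, and computes the state of the network at a given time. The data structure and the function `snapshot` are hypothetical and independent of any Gephi API.

```python
def snapshot(history, t):
    """Return {node: attribute value} for the nodes present at time t.

    history maps each node to a list of (start, end, value) periods;
    end may be None for a period that is still open.
    """
    state = {}
    for node, periods in history.items():
        for start, end, value in periods:
            if start <= t and (end is None or t <= end):
                state[node] = value
    return state

# Hypothetical data: node A leaves and comes back with a different attribute value.
history = {
    "A": [(2000, 2004, 1.0), (2008, None, 3.0)],
    "B": [(2002, 2010, 2.0)],
}
for year in (2003, 2006, 2012):
    print(year, snapshot(history, year))
```

Here a node can reappear after vanishing and its attribute can change between periods, two things the single appearance/disappearance pair described above cannot express.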

  • Clement Levallois 16

    The function Merge columns

    1. Click on this button (Merge columns)

    2. Select the 2 columns which contain the start and end dates (a date or a simple number both work)

    3. Select "Create time interval"

  • Clement Levallois 17

    Last step: create time interval

    It is possible to leave the end date empty.

    If your temporal data are dates, be careful to specify the correct date format: failing to do so will crash the interval creation, and you will probably need to restart Gephi to re-initialize.

    That's all! You can then switch to the Overview, activate the timeline and animate the network.
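Because a wrong date format can crash the interval creation, it can help to check the date columns before importing them. The sketch below assumes rows loaded as dictionaries, columns named Start and End, and the format YYYY-MM-DD; all three are assumptions to adapt to your own data.

```python
from datetime import datetime

def rows_with_bad_dates(rows, columns, date_format="%Y-%m-%d", optional=("End",)):
    """Return (row index, column, value) for every value that does not match date_format."""
    bad = []
    for index, row in enumerate(rows):
        for column in columns:
            value = row.get(column, "")
            if value == "" and column in optional:
                continue  # an empty end date is allowed
            try:
                datetime.strptime(value, date_format)
            except ValueError:
                bad.append((index, column, value))
    return bad

# Hypothetical node table with one badly formatted start date.
rows = [
    {"Id": "a", "Start": "2012-01-01", "End": ""},
    {"Id": "b", "Start": "01/02/2012", "End": "2013-05-01"},
]
bad_dates = rows_with_bad_dates(rows, ["Start", "End"])
print(bad_dates)
```

An empty result means every date matches the format you plan to declare when creating the time interval.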

  • Clement Levallois 18

    Next step: Gephi advanced functions
