From the Data Processing section, you can export interview data, delete interviews from a study, and visualize, save, and export visualizations of interview data.
Select Data Processing on Main Menu
Select Study From Data Processing List
EgoWeb 2.0 is not primarily a network analysis package, although it does have some analysis capabilities. We designed it to export basic data sets that any statistics software can import and analyze. EgoWeb 2.0 enables the collection of personal network data (among other forms), which are better analyzed with script-based statistics software such as R, because scripts can process many networks and produce multiple visualizations at one time. The export files are in comma-delimited (.csv) format.
Before exporting raw data, create a definition of an edge/adjacency in the study you want to export. This is required because the raw data are exported along with network statistics that are calculated from that definition. If you already defined an edge in order to display a visualization of the network as part of a network question, you can use the same expression for data exporting. For example, if you have an alter-pair question “Does alter A know alter B?”, the expression defining an edge must specify that a “yes” answer is an adjacency and a “no” answer is not. This step is necessary because the question could have more than two response options (e.g. “How often does A interact with B? Never, Rarely, Sometimes, Often?”), in which case you may want to define an adjacency to include the answers “sometimes” and “often” and to exclude “rarely” and “never” (which are therefore non-adjacencies).
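The dichotomization described above can be sketched in R as follows. This only illustrates the logic and is not EgoWeb's expression engine: the numeric codes (0 = Never through 3 = Often), the `responses` vector, and the `is_edge` helper are all hypothetical.

```r
# Hypothetical alter-pair responses, coded 0 = Never, 1 = Rarely,
# 2 = Sometimes, 3 = Often (the real codes depend on the study's authoring).
responses <- c(0, 1, 2, 3, 2, 0)

# Hypothetical edge rule: "Sometimes" or "Often" counts as an adjacency.
# Any other value, including a missing response, is treated as a non-adjacency.
is_edge <- function(x, edge_values = c(2, 3)) {
  as.integer(x %in% edge_values)
}

adjacency <- is_edge(responses)
print(adjacency)
```

Changing `edge_values` to `c(1, 2, 3)` would instead treat every answer except “Never” as an adjacency, which is the kind of choice the edge expression records.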
Options for Exporting Data With Stats
Export Ego-Alter Data:
Ego-Alter data exports respondent and network composition variable data. The data are structured with one row per alter, and each row holds the network composition variables (answers to questions about the alters) for that alter. The ego-level data are repeated on every alter row for the same respondent. This structure facilitates dyadic multi-level analyses using multi-level modeling procedures in statistics software such as Stata. Along with the raw ego-level and alter-level data, network statistics for the whole respondent network (e.g. density, isolates) and for each alter (e.g. betweenness, degree) are also included in the ego-alter data export.
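Because ego-level values repeat on every alter row, a common first step in analysis is to collapse the export back to one row per respondent. The base R sketch below uses a small made-up data frame; the column names (EgoID, EGO_AGE, ALTER_NUM, ALTER_CLOSE) are hypothetical and will differ in a real export.

```r
# Made-up ego-alter data: one row per alter, ego-level values repeated.
ego_alter <- data.frame(
  EgoID       = c(1, 1, 1, 2, 2),
  EGO_AGE     = c(34, 34, 34, 51, 51),  # ego-level variable (repeats)
  ALTER_NUM   = c(1, 2, 3, 1, 2),       # alter number within interview
  ALTER_CLOSE = c(1, 0, 1, 1, 1)        # alter-level variable (0/1)
)

# One row per ego: keep the repeated ego-level columns once.
ego_level <- unique(ego_alter[, c("EgoID", "EGO_AGE")])

# Network composition: proportion of alters coded as close, per ego.
prop_close <- aggregate(ALTER_CLOSE ~ EgoID, data = ego_alter, FUN = mean)

# Attach the composition measure to the ego-level table.
ego_level <- merge(ego_level, prop_close, by = "EgoID")
print(ego_level)
```

The same one-row-per-alter layout can be passed directly to multi-level modeling routines, with EgoID as the grouping variable.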
Export Alter-Pair Data:
You can also export the Alter-Pair data, which contains the responses to any alter-pair (alter-alter) question. The file is in edge list format and can be easily imported into a social network analysis software package such as UCINET.
Exporting Main Network Data
Export Other Specify Data:
The “other specify” data are text typed into open text boxes for questions that were defined as “other specify.”
Export Pre-defined Alter List:
The pre-defined alter list is not interview data; it is the list of participants/alters used in the study. These names are added on the study settings page.
Dyad Match:
This is an in-development feature that enables matching alter names across interviews.
Delete Interviews:
Select interviews to remove from the database.
Other Options in Data Processing
The data exported with the “Export Ego-Alter Data” and the “Export Alter Pair Data” options are .csv files (comma separated values). The images below describe the order of the variables when looking at the file with a spreadsheet viewer such as MS Excel.
Ego-Alter Data:
From left to right in Excel, the order of the Ego-Alter data is 1) System variables, 2) EgoID variables, 3) raw ego level variables, 4) ego level network statistics variables, 5) alter number and names, 6) name generator(s) indicators, 7) alter level raw responses, 8) alter level network statistics.
Note on Raw Data Format:
Most of the raw data export is straightforward. Numeric questions export the number that is entered into the entry box. Textual and Textual__PP questions export the text entered into the box. Multiple selection questions export the number that is associated with the response option in the authoring section of the study. For multiple selection questions that allow more than one option to be chosen, the selections are concatenated into one response, separated by semicolons. The individual responses can be split into separate variables using the text-splitting features of other statistical and spreadsheet software. For example, in Excel, highlighting the multiple selection column and choosing “Data > Text to Columns” with the “Delimited” option and the “Semicolon” delimiter creates a new column for each response. The R library “splitstackshape” has a function, cSplit, that converts a delimited variable into several variables, one for each response up to the maximum number of responses, each named with an underscore and a number appended to the original variable name.
For example, if the variable “SOC_SUPPORT” in the data set “egoweb.data” has multiple selections, the code below will replace the SOC_SUPPORT variable with SOC_SUPPORT_1, SOC_SUPPORT_2, SOC_SUPPORT_3, and so on in the object “egoweb.data”.
library(splitstackshape)
# Split the semicolon-delimited SOC_SUPPORT column into SOC_SUPPORT_1, SOC_SUPPORT_2, ...
egoweb.data <- cSplit(egoweb.data, "SOC_SUPPORT", sep = ";", type.convert = TRUE)
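When an analysis needs one 0/1 indicator per response option rather than one column per selection, the semicolon-delimited values can also be recoded in base R. This is a sketch under assumptions: the `soc_support` values and the option codes "1", "2", and "3" are made up, and the `_OPT` column naming is hypothetical.

```r
# Made-up multiple selection responses, as exported (semicolon-delimited).
soc_support <- c("1;3", "2", "1;2;3")

# Split each response into its selected option codes.
selected <- strsplit(soc_support, ";", fixed = TRUE)

# Hypothetical list of all option codes defined for the question.
options <- c("1", "2", "3")

# One row per response, one 0/1 column per option.
indicators <- t(sapply(selected, function(x) as.integer(options %in% x)))
colnames(indicators) <- paste0("SOC_SUPPORT_OPT", options)
print(indicators)
```

Unlike the cSplit output, each column here always refers to the same response option, regardless of the order in which options were selected.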
Ego Level Network Statistics:
Raw EgoWeb data can be exported and analyzed by any statistical software, including software specialized in analyzing social network data. EgoWeb was initially created for collecting personal network data and exports a small set of standard network statistics calculated from the responses in each individual interview. (For more information about measuring structure in personal networks, see McCarty 2002.) When an edge expression is selected, the Ego Alter csv file exports statistics at the ego level and at the level of each individual alter. The ego level statistics are repeated on every alter row for a given respondent. The ego level statistics that are exported include: 1) Density, 2) Max Degree, 3) Max Betweenness, 4) Max Eigenvector Centrality, 5) Degree Centralization, 6) Betweenness Centralization, 7) Components, 8) Dyads, 9) Isolates. The first six of these variables are density and centrality measures at the graph level (which for egocentric / personal data collection means the ego / respondent level). Components, Dyads, and Isolates are counts of subgraphs that are disconnected from the rest of the network. Dyads are components of size 2 (2 nodes connected to each other) and isolates are components of size 1 (a node with no other connections). Each of the variables named “Max” is based on an alter level network statistic (see below). The Max value is simply the highest value among the alters.
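To make the density and subgraph counts concrete, the base R sketch below computes them for a small made-up network. It uses textbook definitions and is not taken from EgoWeb's source, so exported values may follow different conventions (for example, whether the Components count includes dyads and isolates is an assumption here). The `component_sizes` helper is hypothetical.

```r
# Made-up symmetric network of 6 alters: a chain 1-2-3, a dyad 4-5,
# and an isolate 6.
n <- 6
adj <- matrix(0, nrow = n, ncol = n)
edges <- rbind(c(1, 2), c(2, 3), c(4, 5))
adj[edges] <- 1
adj[edges[, c(2, 1)]] <- 1  # mirror each edge so the matrix is symmetric

# Density: observed edges divided by possible edges among n alters.
density <- (sum(adj) / 2) / (n * (n - 1) / 2)

# Hypothetical helper: size of each connected component, found by
# breadth-first search from every not-yet-labelled node.
component_sizes <- function(adj) {
  label <- rep(NA_integer_, nrow(adj))
  current <- 0L
  for (start in seq_len(nrow(adj))) {
    if (!is.na(label[start])) next
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      neighbours <- which(adj[node, ] == 1 & is.na(label))
      label[neighbours] <- current
      queue <- c(queue, neighbours)
    }
  }
  as.vector(table(label))
}

sizes <- component_sizes(adj)
n_components <- length(sizes)   # assumption: counts every component
n_dyads <- sum(sizes == 2)      # components of size 2
n_isolates <- sum(sizes == 1)   # components of size 1
print(c(density = density, components = n_components,
        dyads = n_dyads, isolates = n_isolates))
```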
Alter Level Raw Data:
The columns that follow the ego level system / raw / statistics data are at the alter level. Unlike the ego level variables, which repeat across rows, the alter level variables differ from row to row, with each row representing one alter named in one EgoWeb interview. The first alter level variable column is “Alter Number” followed by “Alter Name”. The “Alter Name” is the text that was entered in one of the name generators. Each name is assigned a number from 1 to n, where n is the total number of alters named in that interview. These numbers are provided in the “Alter Number” column. The alters are numbered in the order in which they were entered.
After the alter number / name columns are columns named after the name generators used in the interview. There will be one column for each name generator. Although the name generators can be spread throughout the interview, separated by ego/alter/network questions, they will appear together in the ego-alter .csv file, immediately after the alter number/name columns. If an alter was named in a name generator, it will have a “1” in that name generator's column and a “0” otherwise.
After the name generator columns, the ego alter .csv file will have the raw responses to the alter level questions. These data are exported using the same procedures as the ego level data (see above).
Alter Level Network Statistics:
In addition to ego level (network / graph level) statistics, EgoWeb also exports network statistics at the alter / node level. Alter level network statistics columns are located after the raw alter level variable columns. As with the ego level statistics, the alter level network statistics that are exported are based on the edge expression selected prior to data export. EgoWeb exports 3 alter level network statistics variables: 1) Degree Centrality, 2) Betweenness Centrality, and 3) Eigenvector Centrality. (For more details about degree centrality and betweenness centrality in personal networks, see McCarty 2002. Eigenvector centrality is explained in detail here. Each of these measures is explained in most articles and books that give details about the measurement of social networks.)
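As an illustration of how an alter level statistic relates to its ego level “Max” counterpart, the base R sketch below computes degree centrality as a raw count of ties on a made-up network. Whether EgoWeb exports raw or normalized degree is not stated here, so treat the scaling as an assumption; the column names ALTER_NUM and DEGREE are hypothetical.

```r
# Made-up symmetric network of 4 alters.
adj <- matrix(0, nrow = 4, ncol = 4)
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(2, 3))
adj[edges] <- 1
adj[edges[, c(2, 1)]] <- 1  # mirror each edge so the matrix is symmetric

# Degree centrality (raw count): number of ties each alter has.
degree <- rowSums(adj)

# The ego level "Max Degree" is the highest alter level value.
max_degree <- max(degree)

alter_stats <- data.frame(ALTER_NUM = seq_along(degree), DEGREE = degree)
print(alter_stats)
print(max_degree)
```

Betweenness and eigenvector centrality take more work to compute by hand; network packages such as igraph for R provide functions for them.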
Alter Pair Data:
The Alter Pair .csv file export organizes the data export as an edge list (or edge array if more than one alter pair question has been asked). Looking at the file from left to right in an Excel spreadsheet, the order of the variables is 1) system variables (same as the Ego Alter data export), 2) alter number for the first alter in the edge, 3) name of the first alter in the edge, 4) alter number for the second alter in the edge, 5) name of the second alter in the edge, 6) alter-pair question raw responses.
Currently, EgoWeb 2.0 exports symmetric data only. Therefore, only unique alter-alter edges are exported. The alter numbers in the alter-pair data are the same as the alter numbers exported in the Ego Alter data. Alters are numbered consecutively in the order they were named, from 1 to the number of alters named.
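Because each pair appears only once in the export, software that expects a full adjacency matrix needs both directions filled in. The base R sketch below shows one way to do that for a single interview; the column names (ALTER1, ALTER2, KNOW) and the rule that a response of 1 is an edge are hypothetical.

```r
# Made-up alter-pair export for one interview with 4 alters:
# each unique pair appears once, with the raw response in KNOW.
alter_pair <- data.frame(
  ALTER1 = c(1, 1, 1, 2, 2, 3),
  ALTER2 = c(2, 3, 4, 3, 4, 4),
  KNOW   = c(1, 0, 1, 1, 0, 0)
)

# Alters are numbered 1..n, so the largest number gives the matrix size.
n <- max(alter_pair$ALTER1, alter_pair$ALTER2)
adj <- matrix(0, nrow = n, ncol = n)

# Keep only the pairs that count as edges (assumed rule: KNOW == 1).
edges <- alter_pair[alter_pair$KNOW == 1, ]

# Fill both directions, since the data are symmetric.
adj[cbind(edges$ALTER1, edges$ALTER2)] <- 1
adj[cbind(edges$ALTER2, edges$ALTER1)] <- 1
print(adj)
```

With several interviews in one file, the same steps would be applied separately to the rows of each interview.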
Data that has been collected with EgoWeb 2.0 can be reviewed, displayed, and modified in Data Processing. Next to each response row in Data Processing, there are three buttons: Edit, Review, and Visualize.
Selecting Visualize opens a screen showing the network from a single interview. There are a variety of settings for the visualization of the network. The options are almost the same as the settings used to visualize a network with a “Network” question type. The first step is to define how the network edges will be drawn (i.e. how to dichotomize an alter-pair question to define the edge to be used in the visualization). EgoWeb 2.0 uses a spring embedding algorithm to position the nodes.
Nodes can be assigned color, size, and shape based on alter questions. Size and color can also be based on the node's centrality, which is calculated using the edge expression. After changing settings, click “Refresh” to redraw the network with the new settings.
The network can be displayed in various ways, either on its own screen or on a new tab that is optimized for printing. Labels can be toggled on and off to assist printing / display while maintaining the anonymity of the respondent. The visualization algorithm can also be re-run to change how the nodes are placed. Nodes can also be moved around the screen manually using a mouse.