When you first click on the "Networks" button in the header you will see a list of all your networks, including those networks that are owned by other users but which you have been granted access to (if any). By default, your networks will be sorted by the date that they were last modified with the most recently modified networks appearing first. You can change the way that the networks are sorted by clicking on any of the Network Name, Access or Modified column headings.
On the page with a list of your networks you will see a button at the bottom of the table labeled "New Network". To upload a network you can either click on this button or the "Create" button in the header.
A dialogue window will open, with four inputs:
- Network Name: Enter the name of the network you are uploading here. This name will allow you to distinguish this network from your other networks.
- File Type: Select the type of file that you plan to upload. Polinode accepts three different file types - Excel, GEXF and JSON. Excel is the most common file format and we provide some further explanation of how to format your data in Excel below. GEXF is a specialized XML-based file format that is sometimes used to store network data. It allows you to easily import network data from other specialized packages. JSON is a Polinode-specific format that is mainly used if you want to interface with our API. However, you can also upload JSON as a flat file in the same format here. You can learn more about our API and the format of JSON that it accepts here.
- File: Click the Choose File button and then select the file that you would like to upload. You will also see a "Download template" button here - a template is available for each of the three file types - Excel, GEXF and JSON - and this button will change depending on what file type you have selected. Click this button in order to see a simple example of the format you have selected.
- Access: When you upload a network to Polinode you can choose whether this network is a Public or Private network. Public networks can be accessed by anyone, whereas for private networks you determine exactly who can access your network (if anyone). So, if you would like others to be able to easily access your network via a URL and don't mind the data being publicly available you may want to select "Public". But if your data is private and you want to control access to your network then you will want to select "Private".
By default, Partner and Enterprise users cannot upload Public networks, for privacy and security reasons. This setting can be changed on request.
If you click on "Download Excel template" and open that file up in Excel you will see a simple example Excel file that contains illustrative data in the Excel format that Polinode accepts. Let's take a look at this file in a bit of detail. You will see that there are two worksheets in the file - Nodes and Edges. The Nodes worksheet contains one compulsory column - Name. This column contains the names of the nodes in your network and each name must be unique as it also serves as the unique id of the node.
Next to the "Name" column you will see two example attribute columns - "Example Categorical Attribute Name" and "Example Numerical Attribute Name". These columns are in the template just to provide an example of uploading node attributes. Attribute columns are optional - you don't have to include any attributes. You can think of attributes as being like metadata on the nodes. For example, if your nodes represent people you can include demographic attributes such as Gender, Age, Salary, etc. Attributes can be either text (i.e. categorical attributes) or numbers (i.e. numerical attributes) and you don't need to tell Polinode what type your data is - it will work it out automatically. The names of the attributes in the network are determined by the column headings in this file. So, if you are using this template file you will want to rename the column headings. For example, use "Age" instead of "Example Numerical Attribute Name".
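The automatic detection of attribute types described above can be pictured with a short sketch. Polinode's actual detection rules are not documented here, so the rule below (a column is numerical only if every non-empty value parses as a number) and the function name infer_attribute_type are assumptions made purely for illustration.

```python
def infer_attribute_type(values):
    """Guess whether a column holds numerical or categorical data.

    Assumed rule (not Polinode's documented behaviour): a column is
    numerical only if every non-empty value can be read as a number.
    """
    non_empty = [v for v in values if v is not None and str(v).strip() != ""]
    if not non_empty:
        return "categorical"
    for value in non_empty:
        try:
            float(value)
        except (TypeError, ValueError):
            return "categorical"
    return "numerical"


# Two hypothetical attribute columns from a Nodes worksheet.
columns = {
    "Gender": ["Female", "Male", "Female"],
    "Age": [34, 41, 29],
}
for heading, values in columns.items():
    print(heading, "->", infer_attribute_type(values))
```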
The order of the columns in the Excel file does not matter. For example, "Name" does not need to be the first column. Generally though it's preferable for "Name" to be the first column.
Generally speaking you can name your attributes anything at all. However, there are a few reserved column names that have a special interpretation in Polinode and so you should only use them for their intended purpose. The first of these is "Name", which we have already mentioned above. The others are detailed below (and are also covered in the comment in cell B1 of the Excel template file):
- Imageurl: If you would like to include an image within a node, you can add a url to that image by using this reserved attribute. Many popular services for images such as LinkedIn and Twitter will just work, i.e. you can enter the URL directly such as https://pbs.twimg.com/profile_images/503093797194973184/16HP_Omb.jpeg. These services automatically add an Access-Control-Allow-Origin header to their response. However, if you use images from a URL that does not add this header you will need to prepend a third-party proxy such as https://crossorigin.me to the URL. For example, you may enter something like: https://crossorigin.me/http://yourURL.com/yourImageName.jpg. Alternatively, more advanced users may want to upload images to a CDN such as Cloudfront and use the URLs from that CDN, being sure to enable CORS for polinode.com.
- Imagescale: If you have set an imageurl for a node you can increase or decrease the size of the image using this input. This setting is particularly useful when you have a square image from LinkedIn or Twitter and would like to expand it to fill a circular node. The default value for Imagescale is 1.
- Imageclip: If you have set an imageurl for a node you can remove a portion of the image (from the edge of the image). This is useful when you would like a border around the image after expanding the image using Imagescale. The default value for Imageclip is 1.
- Icontype: It's possible to add an icon to a node using this reserved column name. Simply enter the Unicode identifier in this column for a Font Awesome icon. For a full list of available icons please see this link: https://fontawesome.com/v4.7.0/icons/. For example, to display an anchor you would enter f13d which you can find by clicking on the anchor icon on this page: https://fontawesome.com/v4.7.0/icon/anchor.
- Iconscale: If you would like to increase or decrease the size of the icon that you have specified via Icontype you can do so with this reserved column. The default value for Iconscale is 1.
- Iconcolor: If you would like to specify a color for the icon you can do so via this reserved column. The default value is white, i.e. rgba(255, 255, 255, 1).
- Nodetype: It's possible to specify different shapes for nodes using the reserved Nodetype column. The supported shapes and values for this column are circle, star, triangle, diamond and square. The default value for Nodetype is circle.
All reserved column names are case insensitive. For example, it doesn't matter whether you use "Name" or "name" or "imageurl" or "imageURL".
There are a few more reserved column names than the above - "Label", "Color", "Hidden", "Size", "Id", "x" and "y" are also reserved column names. Generally though it's better not to make use of these reserved column names as you can interactively set labels, node colors, sizes and hide nodes as well as position them within the application.
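If you build your Nodes worksheet programmatically, it can help to check the column headings against the reserved names before uploading. The sketch below collects the reserved node column names listed above and compares headings case-insensitively. The function name find_reserved is hypothetical and this check is not part of Polinode itself.

```python
# Reserved node column names listed in this guide, stored in lower case
# because reserved names are case insensitive.
RESERVED_NODE_COLUMNS = {
    "name", "imageurl", "imagescale", "imageclip", "icontype",
    "iconscale", "iconcolor", "nodetype",
    "label", "color", "hidden", "size", "id", "x", "y",
}


def find_reserved(headings):
    """Return the headings that match a reserved node column name."""
    return [h for h in headings if h.strip().lower() in RESERVED_NODE_COLUMNS]


headings = ["Name", "Gender", "imageURL", "Size", "Salary"]
print(find_reserved(headings))
```

Any heading returned by this check will be given its special interpretation, so it should only be used for its intended purpose.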
If you add an attribute that is not a reserved column name (e.g. Gender), you can optionally add one of three prefixes in front of that attribute name:
- _ (e.g. _Gender): Any attribute names that are prefaced with an underscore ("_") will be uploaded and available in the explore view but those attributes will be hidden in the left-hand-side explore window. This is useful if you want to, say, size or color by these attributes but don't want them visible to users as they explore the network.
- * (e.g. *Year): Any attribute names that are prefaced with an asterisk ("*") will be uploaded but interpreted as a categorical (i.e. text) attribute regardless of whether the data is numerical or not. For example, you may upload a column "*Year" and the years in that column will be processed as text rather than numbers.
- # (e.g. #Email): Any attribute names that are prefaced with a hash ("#") will be uploaded but not included as attributes that you can size by, color by or filter by. This is most useful when you have a text attribute that takes on many different values.
If you have a large network (say, thousands of nodes) with many attributes, one way to decrease the load time of that network is to prepend a "#" to the names of categorical attributes that take a large number of different values or that have a unique value for each node. Polinode will then no longer parse the attribute and enumerate all of its unique values each time the network is opened. The attribute values will still be available in the left-hand-side menu on hovering over or clicking a node.
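The three prefixes can be summarized as a small lookup. The sketch below splits a column heading into its prefix meaning and the remaining attribute name. The labels "hidden", "force_categorical" and "not_sizable" are invented for this example, and whether Polinode strips the prefix from the displayed attribute name is not stated in this guide.

```python
# Meaning of each optional prefix, using labels invented for this sketch.
PREFIXES = {
    "_": "hidden",             # uploaded, but hidden from users exploring the network
    "*": "force_categorical",  # always treated as text, even if the values are numbers
    "#": "not_sizable",        # uploaded, but not offered for size by, color by or filter by
}


def parse_heading(heading):
    """Split a column heading into (attribute name, prefix meaning or None)."""
    first = heading[:1]
    if first in PREFIXES:
        return heading[1:], PREFIXES[first]
    return heading, None


for heading in ["_Gender", "*Year", "#Email", "Salary"]:
    print(heading, "->", parse_heading(heading))
```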
The second worksheet that you will see in the Excel template file is the Edges worksheet. In the Edges worksheet there are two compulsory columns - Source and Target. The Source column specifies the origin of an edge (i.e. which node it starts at) whereas the Target column indicates which node that edge goes to (i.e. the node it terminates at).
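You will receive an error message if you specify a value in the Source or Target column that is not present in the Name column in the Nodes worksheet. The MATCH() function in Excel can quickly tell you which values in those columns are missing from the Name column. For example, assuming that Name is column A of the Nodes worksheet and Source is column A of the Edges worksheet, the formula =ISNA(MATCH(A2, Nodes!A:A, 0)) entered in a spare column of row 2 of the Edges worksheet returns TRUE when the Source value in that row is missing.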
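If your data is prepared in a script rather than by hand, the same check can be run before the file is created. The following is a minimal sketch that assumes the node names and edges are already held in Python lists. The function name find_unknown_endpoints is hypothetical and the comparison is an exact, case-sensitive match, which may be stricter than the check Polinode applies.

```python
def find_unknown_endpoints(node_names, edges):
    """List every Source or Target value that is missing from the node names.

    Returns (worksheet row, column, value) tuples. Rows start at 2 because
    row 1 of the Edges worksheet holds the column headings.
    """
    known = set(node_names)
    problems = []
    for row, (source, target) in enumerate(edges, start=2):
        for column, value in (("Source", source), ("Target", target)):
            if value not in known:
                problems.append((row, column, value))
    return problems


nodes = ["Alice", "Bob", "Carol"]
edges = [("Alice", "Bob"), ("Bob", "Dave"), ("Eve", "Carol")]
for row, column, value in find_unknown_endpoints(nodes, edges):
    print(f"Edges row {row}: {column} value {value!r} is not in the Name column")
```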
Just as for the Nodes worksheet, you can also specify as many attributes as you need for each edge in the worksheet. You will find that the template Excel file contains two example attributes for edges - "Example Categorical Attribute Name" and "Example Numerical Attribute Name" which you can rename as needed and add additional columns as required.
There are some reserved column names for edges just as there are for nodes and these reserved names are "Weight", "id", "Color", "Hidden" and "Label". Generally it's better to use the interactive sizing, filtering and coloring functionality in the application rather than using these reserved columns.
The same prefixes that are available for node attributes can also be used for edge attributes, i.e. you can prepend any of "_", "*" or "#" with the same result as detailed above for nodes.
Returning now to the Create New Network dialogue, you will see that there is a button at the bottom of the dialogue called "Advanced Options". If you click on this button you will see one more input, which lets you specify whether a network is directed or not. By default all networks are directed in Polinode, but if your data is undirected you should select No for Directed here. In a directed network the direction of each edge has meaning, that is to say that we differentiate between the source and the target of an edge. In an undirected network, the relationship exists but it doesn't have a direction. A helpful example is the difference between relationships in Twitter versus Facebook. In Twitter, you can follow another user but that user doesn't necessarily follow you. That is an example of a directed network. In Facebook though, if you are friends with another user then that user is also friends with you - that is an example of an undirected network.
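The distinction can be made concrete with a small edge list. The sketch below is conceptual only and does not describe how Polinode stores edges: it counts the distinct relationships in the same data when direction is respected and when it is ignored. The function name unique_edges is hypothetical.

```python
def unique_edges(edges, directed=True):
    """Return the distinct edges, optionally ignoring direction."""
    seen = set()
    result = []
    for source, target in edges:
        # In an undirected network (A, B) and (B, A) are the same relationship.
        key = (source, target) if directed else tuple(sorted((source, target)))
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


follows = [("Ann", "Ben"), ("Ben", "Ann"), ("Ann", "Cat")]
print(len(unique_edges(follows, directed=True)), "directed edges")
print(len(unique_edges(follows, directed=False)), "undirected edges")
```

Treated as a Twitter-style directed network, Ann following Ben and Ben following Ann are two separate edges. Treated as a Facebook-style undirected network, they are a single friendship.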
The final step in creating a new network is to click the Create button at the bottom right of the dialogue. Once clicked, your data will be checked and, if valid, uploaded. The upload can take anywhere from a few seconds for a small network through to a few minutes for a large, complicated network.
Interacting with a Network
To start interacting with a network simply click the "Explore" button for that network. You will then be taken to the explore view for that network. If no x and y positions have been specified for a network the force directed layout algorithm will automatically start running when the network is opened.
The force directed layout algorithm is a continuous algorithm that reaches an equilibrium after it has been run for a while. On first opening a network it is run for just a few seconds, which is generally not long enough for it to reach an equilibrium. In most cases you should navigate to Layout and continue to run the force directed layout until the nodes come close to stopping (i.e. reach an equilibrium) and then press the stop button under Layout.
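To give a feel for what "running until equilibrium" means, here is a deliberately simple force directed sketch. It is not Polinode's implementation: the force formulas, the constants (attraction, repulsion, max_move) and the stopping tolerance are all assumptions chosen for illustration. Each step pushes every pair of nodes apart, pulls connected nodes together, and reports the largest movement so that the loop can stop once the nodes have nearly stopped moving.

```python
import math
import random


def force_step(positions, edges, attraction=0.05, repulsion=500.0, max_move=10.0):
    """Apply one round of forces in place and return the largest node movement."""
    nodes = list(positions)
    forces = {n: [0.0, 0.0] for n in nodes}
    # Repulsive force between every pair of nodes, weaker at larger distances.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = positions[a][0] - positions[b][0]
            dy = positions[a][1] - positions[b][1]
            dist = math.hypot(dx, dy) or 1e-6
            push = repulsion / (dist * dist)
            forces[a][0] += push * dx / dist
            forces[a][1] += push * dy / dist
            forces[b][0] -= push * dx / dist
            forces[b][1] -= push * dy / dist
    # Attractive (spring-like) force along every edge.
    for s, t in edges:
        dx = positions[t][0] - positions[s][0]
        dy = positions[t][1] - positions[s][1]
        forces[s][0] += attraction * dx
        forces[s][1] += attraction * dy
        forces[t][0] -= attraction * dx
        forces[t][1] -= attraction * dy
    # Move each node, capping the step so the simulation stays stable.
    largest = 0.0
    for n in nodes:
        fx, fy = forces[n]
        size = math.hypot(fx, fy)
        if size > max_move:
            fx, fy = fx * max_move / size, fy * max_move / size
            size = max_move
        positions[n] = (positions[n][0] + fx, positions[n][1] + fy)
        largest = max(largest, size)
    return largest


def run_layout(positions, edges, tolerance=0.01, max_steps=500):
    """Repeat force_step until nodes nearly stop moving or max_steps is reached."""
    for step in range(1, max_steps + 1):
        if force_step(positions, edges) < tolerance:
            return step
    return max_steps


random.seed(1)
names = ["A", "B", "C", "D"]
layout = {n: (random.uniform(-50, 50), random.uniform(-50, 50)) for n in names}
links = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print("steps run:", run_layout(layout, links))
```

The early steps move nodes a long way and later steps barely move them at all, which is why a few seconds of running on first opening a network is often not enough.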
You can interact with your network in all the ways that you would expect. For example, if you left click on a node and drag it you can move it around. You can also use the scroll wheel on your mouse to zoom in and out, and if you left click on the background (i.e. anywhere that is not over a node) and drag you can move the network itself around. Furthermore, if you hover over a node the window on the right-hand-side of the screen will display all the attributes for that node. If you left click a node, that node and all the nodes it is connected to will be selected and the attributes for that node will be displayed in the right-hand-side window. To quickly clear such a selection you can either right-click anywhere on the background of the network or click on the cross towards the top of the right-hand-side window.
If there is a description for the network that you are exploring then that description will appear in the right-hand-side window when you first open the explore view. By default networks do not have a description so nothing will show but a description can be set under Settings for a network. The description can also be hidden by clicking on the cross towards the top of the right-hand-side window. Once the description has been hidden, you can show it again by clicking on the information icon towards the top of the right-hand-side window.
Searching for Nodes
At the top of the right-hand-side window you will find a search input where you can start typing the name of a node and select from the auto-completed options shown. By default, you are able to search nodes by their name. However, if you change the attribute that is used for displaying labels (see the Attribute setting under Labels) then that attribute will be used in the search input as well.
Selecting a node via the search input works in exactly the same way as selecting a node by left-clicking on it, i.e. that node and all of the nodes that it is connected to will be selected. It is also possible to cumulatively add nodes to the selection by using the search input. For example, if you select Node A, then you can add Node B and all of Node B's neighbors to the selection by typing Node B's name into the search box and selecting it. This works the same as holding down the control key (or the command key on Apple devices) while left clicking nodes but has the advantage of being able to progressively search for nodes to add by their name or another attribute.
We will now turn our attention to the left-hand-side menu in the explore view and step through what each of the icons in this menu does and how to use it.
If you are new to Polinode you may want to work with the left-hand menu expanded to see the text descriptions for each icon. To do this, click the expand button at the very bottom of the left-hand-side menu. You can always click this button again to collapse this menu back to its abbreviated form.
The first icon in the left-hand-side menu is Layout. Clicking on this icon will open a menu at the bottom of the page. By default the force directed layout will be selected. In total there are six options for layout, all of which allow you to control or impact the position of nodes:
- Force Directed: This layout algorithm simulates physical forces on the network. You can think of it as applying an attractive force between nodes that are connected by an edge and simultaneously applying a repulsive force between all pairs of nodes. It is a continuous algorithm that will reach an equilibrium when these forces are in balance and the nodes stop moving. It needs to be started and stopped manually by clicking the start and stop buttons. There is also one advanced setting for it and that is the Prevent Overlap option. By default this option is No and you should always start running the layout algorithm with Prevent Overlap set to No. Once the nodes have reached equilibrium you may want to "tidy up" the layout by running the force directed layout again with Prevent Overlap set to Yes. The same physical forces will be simulated but in this case the relative size of nodes will be taken into account so that overlapping nodes are repelled from each other. It is significantly slower to run the algorithm with Prevent Overlap set to Yes, which is why it should only be applied after an initial equilibrium has been reached.
- Lens: The Lens layout is very similar to the Force Directed layout in that attractive and repulsive forces are applied. However, the forces that are applied are slightly different and result in a more circular layout, i.e. a lens-like layout. This can be helpful if your network has a large number of isolates or disconnected parts of the network that you would still like to include.
- Distribute Nodes: Distribute Nodes is not really a traditional layout algorithm but is generally applied when you want to add some space between nodes or reduce the overlap between nodes that have been positioned by another layout algorithm. It will iterate through all the nodes in the network and will repel nodes that are close to each other. This is different to running the Force Directed layout algorithm with Prevent Overlap set to Yes because the latter will apply both attractive and repulsive forces whereas the Distribute Nodes algorithm will only apply a repulsive force and only between relatively close nodes. There are a number of settings available for the Distribute Nodes algorithm which you can access by clicking on "Advanced". The first is Margin, which is a circular margin added around each node when determining whether two nodes should be repelled from each other. The second is Scale Factor, which is similar to Margin in that an area is added to each node but the difference is that the size of that area is proportionate to the size of each node whereas Margin is a fixed amount for each node. If Margin is set to 0 and Scale Factor is equal to 1 then only nodes that are actually touching will be repelled from each other. The Grid Size setting can generally be left at 25 - it determines the degree of approximation for the algorithm. The algorithm is applied in an iterative fashion, so it is applied once to measure overlap and then repulsive forces are applied as necessary and then it is applied again and so on until there is no more overlap (after allowing for the Margin and Scale Factor of course). Max Iterations determines the maximum number of iterations that the algorithm will run through before stopping. So, if you have a large network with a lot of overlapping nodes, you may want to set Max Iterations to a very low number like 2 to begin with, otherwise you may find that the algorithm takes a very long time. The final input is Speed and this determines the size of the repulsive force that is applied at each iteration.
- Hierarchical: The Hierarchical layout will attempt to lay a network out based on layers in an optimal fashion, i.e. it will give you an organization chart type layout. There are five advanced settings for the Hierarchical layout. The first two - Direction and Alignment - relate to how the layers that the algorithm determines are appropriate are presented. Node Separation determines the horizontal space between nodes. Edge Separation determines the horizontal separation between edges. Finally, Rank Separation determines the distance between the layers that the algorithm determines are appropriate. Typically you will need to experiment with these parameters in order to find a hierarchical layout that works well for your data.
- Plot Nodes: The Plot Nodes option allows you to move from a network-type layout to a Cartesian-type layout. That is to say that you can select any attributes in your network and plot the nodes in the network by those attributes on x and y axes. This can include any attributes that have been added to the network as a result of calculating metrics (e.g. In Degree). The key inputs for this option are "Horizontal Attribute" and "Vertical Attribute". You can specify both horizontal and vertical attributes but don't need to; if you specify one only then a bar or column chart will be the result for categorical attributes. Once you have chosen the attribute(s) to use there are a number of additional options available under Advanced. You can use Horizontal and/or Vertical Adjustments to use the log of an attribute rather than the actual value of that attribute. You can also specify horizontal min and max values and vertical min and max values. These options are useful when you have extreme values or outliers and are of course only relevant for numerical attributes. The final option is how to order categorical attributes. There are two options here - alphabetical and count. By default categorical attributes will be ordered by count. That means that they will be plotted in descending order of count, i.e. the number of nodes that fall in each category. Since the legend is always shown in alphabetical order sometimes you may wish to plot these categories alphabetically rather than by count and this is why the alphabetical option exists here.
- Reposition Isolates: Often when working with network data you will find that some nodes are not connected to any other nodes (isolates) and/or that there are parts of the network that are completely disconnected from other parts of the network (connected components). It's often desirable to examine a network without these isolates or disconnected components and that is what the Reposition Isolates option allows you to do. After running a layout like the Force Directed Layout you can run Reposition Isolates and all isolates and disconnected components will be moved to the bottom of the network and arranged in a grid. The dimensions of that grid and vertical space before it starts are determined by the three options that you will find under Advanced Options - Nodes per Row, Row Height and Vertical Space.
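As a rough illustration of the Reposition Isolates option, the sketch below places nodes that have no edges in a grid beneath the rest of the network. It is an assumption-laden simplification: it handles isolates only (not disconnected components), it assumes screen-style coordinates in which larger y values are lower on the page, and the column_width parameter is invented for the example. The nodes_per_row, row_height and vertical_space parameters mirror the three advanced options named above.

```python
def reposition_isolates(positions, edges, nodes_per_row=10, row_height=40.0,
                        vertical_space=100.0, column_width=40.0):
    """Return new positions with isolates arranged in a grid below the network."""
    connected = {node for edge in edges for node in edge}
    isolates = [n for n in positions if n not in connected]
    placed = [positions[n] for n in positions if n in connected]
    # The grid starts at the left edge and below the lowest connected node.
    left = min((x for x, _ in placed), default=0.0)
    bottom = max((y for _, y in placed), default=0.0)
    new_positions = dict(positions)
    for index, node in enumerate(isolates):
        row, column = divmod(index, nodes_per_row)
        new_positions[node] = (left + column * column_width,
                               bottom + vertical_space + row * row_height)
    return new_positions


positions = {"a": (0, 0), "b": (10, 20), "c": (5, 5), "d": (7, 7), "e": (1, 1)}
edges = [("a", "b")]
print(reposition_isolates(positions, edges, nodes_per_row=2))
```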
In Polinode, layout algorithms are applied to the visible network only. So, if you filter nodes and edges, you can then re-run a layout algorithm in order to run it on the visible network only, for example on only the female nodes.
The position of nodes can be reset at any time by clicking on the Reset button. This will position nodes in an equally spaced circle and you can then re-run a layout algorithm after resetting the positions of nodes.
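An equally spaced circle of this kind is straightforward to compute. The sketch below shows one way to do it. The radius value and the function name circle_positions are assumptions for the example, not values taken from Polinode.

```python
import math


def circle_positions(nodes, radius=100.0):
    """Place nodes at equal angular intervals around a circle centred on (0, 0)."""
    count = len(nodes)
    return {
        node: (radius * math.cos(2 * math.pi * i / count),
               radius * math.sin(2 * math.pi * i / count))
        for i, node in enumerate(nodes)
    }


for name, (x, y) in circle_positions(["A", "B", "C", "D"]).items():
    print(name, round(x, 1), round(y, 1))
```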
In Polinode, it's possible to save up to 50 views per network (on the Professional Plan or above). A view in Polinode is a set of settings for the network. Essentially, each view preserves the following:
- The position of nodes, which you can set using one or more of the options under Layout.
- The Layers for the view, which you can think of as like a set of operations to be applied and the order in which those operations are to be applied (for example size nodes by an attribute then color nodes by another attribute).
- Visual settings for the view such as the background color, edge type, color of nodes, etc.
To open a saved view in Polinode you simply click on the Open button in the left-hand-side menu and then select the view that you would like to open. That view will then be applied, including the positions of nodes, the saved layers and the visual settings.
Within the open views dialogue that will open you can also make any saved view the default view for the network and you can delete any previously saved view. The default view for a network will open automatically when the network is first opened in the explore view.
When a view is open for the network you will see that the Open button will turn blue. This helps you see quickly whether a view is open or not and is true not just of the open button but also most other buttons in the left-hand-side menu. For example, if a filter is active that button will be blue and if a roll-up operation has been applied then that button will be blue.
When a view is open for the network you will see the name of that view with a small cross next to it when you click the open button. In order to clear that view without opening another view you can click on that cross button and the network will revert to a state without any views open.
In order to save a view for a network, click on the Save button and a dialogue will open with two tabs. Use the first tab to save a new view for the network by specifying a name for the view that you are saving as well as whether the view should be treated as the default view or not. If the network already has a default view and you select "Yes" here then the previous view will no longer be treated as the default view and the new view that you are saving will become the default in its place.
The second tab in this dialogue gives you the ability to update an existing view. If you have a view open you can use this dialogue to update that view. Whether you are saving or updating a view, once you confirm that you are ready to save the view the node positions, layers and visual settings will be preserved and can be accessed later by clicking the Open button above.
Standard views are available at the network level, i.e. each one is associated with a specific network. However, it is also possible to save and apply what are called Template Views. A template view can include a set of layers to be applied and/or a set of visual settings (such as node colors, background color, etc). It doesn't include node positions, however, because template views are designed to be applied across networks and sit at the user level rather than against a specific network. So, you can use a Template View to, for example, save a particular color theme that you can then apply with a single click to any network.
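One way to picture the difference between the two kinds of view is as two small data structures. The classes and field names below are hypothetical and are not Polinode's storage format or API. They simply record what each kind of view preserves according to the description above: a template view holds layers and visual settings, while a standard view additionally holds node positions and a default flag.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateView:
    """User-level view: layers and visual settings, but no node positions."""
    name: str
    layers: list = field(default_factory=list)
    visual_settings: dict = field(default_factory=dict)


@dataclass
class View(TemplateView):
    """Network-level view: also stores node positions and a default flag."""
    node_positions: dict = field(default_factory=dict)
    is_default: bool = False


def to_template(view, name):
    """Copy the parts of a view that can be reused on any network."""
    return TemplateView(name=name, layers=list(view.layers),
                        visual_settings=dict(view.visual_settings))


view = View(
    name="Gender overview",
    layers=["Color nodes by Gender"],
    visual_settings={"background": "white"},
    node_positions={"Alice": (0.0, 0.0), "Bob": (10.0, 5.0)},
    is_default=True,
)
template = to_template(view, "Light colors by Gender")
print(template)
```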
In order to access template views simply click on Templates in the left-hand-side menu. This will bring up a dialogue with three tabs. The first is a list of all available template views. All accounts in Polinode have two default template views available and they are Dark Theme and Light Theme. So, if you would like to apply a Light Theme rather than the default Dark Theme you can click the apply button next to the Light Theme template view. You can also delete any existing template views from this table.
The second tab in the dialogue is used to save a new template view - the only input required is a name for the template view and once saved that template view will be available for future use and will capture the layers and visual settings that you see at the time you save it.
The third tab in the dialogue is used in order to update an existing template view. To use it you simply select the template view that you would like to update and then click the Update Template button, after which the template you selected will be updated with the currently visible visual settings and layers.
If you are using Polinode for large networks (typically >10,000 nodes or edges) you may want to activate lightning mode for your network by clicking on Lightning in the left-hand-side menu. This will toggle lightning mode on and it can be clicked again to toggle lightning mode off. Lightning mode will hide the rendering of edges in your network while you interact with it, which makes it faster to interact with. Once you are ready to see the edges again you can always toggle lightning mode off.
The Labels option in the left-hand-side menu gives you a lot of control over the appearance of node labels. By default, labels are only shown once you select a node and then they are only shown for that node and the nodes that it is connected to. You can change this by changing the Show Labels setting to either Always or When Toggled Only. If this setting is set to Always then labels will be shown for all visible nodes. For smaller networks this may be desirable but for most networks showing labels for all nodes will fill the screen with labels. However, you can easily reduce the number of labels that are shown by clicking Filter and then clicking on the Labels tab before filtering labels by an attribute, for example the same attribute that you are sizing nodes by. The other option for Show Labels is When Toggled Only. If this option is selected then labels will not be shown even after selection but can be toggled on or off for all visible nodes by pressing the "n" key on your keyboard.
There are a total of six other settings for labels in the bottom window:
- Show Background: Your node labels will have a light blue background color by default but you can remove this background color altogether by selecting No for this option.
- Label Text: To change the color of the label text for your labels you can use this input.
- Attribute: By default node names will be used for node labels, however, you can select any text attribute for nodes to use for the label text here.
- Label Background: If Show Background is set to Yes then your node labels will have a background color and you can customize that background color with this input.
- Style: By default your node labels will be bold but the style of the text can be changed to regular (i.e. not bold), italics or bold italics by selecting the corresponding option.
- Label Position: By default node labels will appear to the right of each node but this can be changed so that node labels appear to the left, top, bottom or in the center of nodes. There is also an option "Center if Space" that, if selected, will ensure that node labels appear in the center of nodes where there is enough space to fit those labels and will appear to the right of the nodes where there is not enough space.
In Polinode you can add and then preserve a series of operations that you want to apply to the network and these operations are saved as layers. You can think of layers in Polinode as being like layers in Photoshop. The important point is that the order that you are applying those operations is preserved. So, for example, you can first calculate a metric like In Degree and then size nodes by In Degree.
There are a total of 11 different types of layers in Polinode:
- Calculating a metric (either a node or edge metric)
- Coloring nodes by an attribute
- Coloring edges by an attribute
- Coloring labels by an attribute
- Sizing nodes by an attribute
- Sizing edges by an attribute
- Sizing labels by an attribute
- Filtering nodes by an attribute
- Filtering edges by an attribute
- Filtering labels by an attribute
- Rolling nodes up by an attribute
You will notice that the five options in the left-hand-side menu immediately below the Layers option relate to these different types of layers, i.e. they are Metrics, Color, Size, Filter and Roll-up.
If you would like to view a summary of all the layers that are currently applied you can click on this Layers button in the left-hand-side menu and a summary of the layers will then appear in the right-hand-side window. You can edit any of these layers at any time by clicking on it and you can also remove a layer by clicking on the delete icon for that layer.
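The reason the order of layers matters can be shown with a small sketch in which each layer is a function applied to the network in sequence. This is an illustration only: the network structure, the function names and the linear size scaling are assumptions, not a description of how Polinode implements layers.

```python
def in_degree_layer(network):
    """Metric layer: store each node's number of incoming edges as 'In Degree'."""
    counts = {name: 0 for name in network["nodes"]}
    for _, target in network["edges"]:
        counts[target] += 1
    for name, attributes in network["nodes"].items():
        attributes["In Degree"] = counts[name]


def size_by_layer(attribute, min_size=5.0, max_size=25.0):
    """Build a sizing layer that scales node size linearly with an attribute."""
    def apply(network):
        values = [attrs[attribute] for attrs in network["nodes"].values()]
        low, high = min(values), max(values)
        span = (high - low) or 1
        for attrs in network["nodes"].values():
            share = (attrs[attribute] - low) / span
            attrs["size"] = min_size + share * (max_size - min_size)
    return apply


def apply_layers(network, layers):
    """Apply each layer in the order given."""
    for layer in layers:
        layer(network)
    return network


network = {
    "nodes": {"A": {}, "B": {}, "C": {}},
    "edges": [("A", "C"), ("B", "C"), ("C", "A")],
}
apply_layers(network, [in_degree_layer, size_by_layer("In Degree")])
for name, attributes in network["nodes"].items():
    print(name, attributes)
```

Swapping the two layers would fail in this sketch because the sizing layer needs the In Degree attribute that the metric layer creates, which is the kind of dependency that preserving the order of layers protects.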
Typically an important part of network analysis is the calculation of metrics based on the network. Polinode provides the ability to calculate 21 different node metrics with a full list available by clicking on the "Advanced Metrics" button after opening the Metrics dialogue. Each of these 21 metrics is summarized below:
- Average Neighbor Degree: Average Neighbor Degree for a node is the average number of edges (i.e. degree) that a node's neighbors have. For directed networks, you can specify whether to use in-degree or out-degree for each of the source and target nodes in the calculation. Read more here.
- Betweenness Centrality: Betweenness Centrality for a node is the total number of shortest paths that pass through that node. If the Normalized option is selected, this count is divided by the total number of shortest paths in the network. It is a measure of how much a node is a 'bridge' between other nodes in the network. Read more here. Betweenness can be computationally expensive to calculate, particularly for large networks, which is why the option to sample a subset of nodes is provided as an input. If Apply Edge Weights is set to Yes then the inverse of the edge weights will be used such that a larger edge weight effectively reduces the distance between two nodes rather than increasing it.
- Binary Flag: Binary Flag is a helper metric that is equal to True for a node if the selected attribute is equal to one of the selected values for that node. Otherwise, it is equal to False. It is particularly helpful when used together with the External edge metric.
- Brokerage: Brokerage here refers to Gould-Fernandez brokerage. Given an attribute, there are five kinds of brokerage possible: Coordinator, Consultant, Gatekeeper, Representative and Liaison. This metric will count up and return the number of times that a node acted in each of those roles. Read more here.
- Closeness Centrality: Closeness Centrality for a node is the reciprocal of its farness. The farness of a node is the sum of its shortest path distances from all other nodes. The greater a node's Closeness Centrality relative to other nodes, the closer it is on average to other nodes in the network. Read more here.
- Clustering: The Clustering coefficient for a node is the fraction of possible triangles through that node that actually exist. The higher a node's clustering coefficient, the more embedded it is in the overall network. Read more here.
- Connected Components: A Connected Component is a set of nodes that are connected to each other. A directed network will be treated as an undirected network for the calculation of Connected Components. Read more here.
- Constraint: Constraint is related to the concept of structural holes and measures the extent to which a node is able to take advantage of structural holes in their network. Constraint will be higher if a node's connections are highly connected between each other, either directly or indirectly through a mutual connection. Read more here.
- Core Number: Core Number for a node is the largest value k of all k-cores containing that node where a k-core is the largest possible subgraph in the network containing nodes with a Total Degree of k or more. Core Numbers can be helpful in the decomposition of large networks. Read more here.
- Current Flow Closeness Centrality: Current Flow Closeness Centrality is similar to regular Closeness Centrality but instead of a shortest path measure for distance, effective resistance inspired by electric circuit models is used. Read more here.
- Effective Size: Effective Size is related to the concept of structural holes and the redundancy of connections. It measures the number of people that the node is connected to but controlling for (i.e. reducing by) the redundancy of those connections. Read more here.
- Efficiency: Efficiency is equal to effective size divided by total degree. If a node has no redundant ties then the effective size will be equal to total degree and efficiency will be equal to one. Efficiency is the proportion of a node's ties that are non-redundant. Read more here.
- Eigenvector Centrality: Eigenvector Centrality is motivated by the idea that nodes connected to other nodes that are central should themselves be relatively central, i.e. being connected to a central node contributes more than being connected to a non-central node. It is not always well defined for directed networks and it's generally preferable to calculate Katz Centrality for directed networks. However, should you calculate eigenvector centrality for a directed network Polinode will return "left" eigenvector centrality (i.e. corresponding to the in-edges). Read more here.
- External vs Internal: External vs Internal (EI) calculates, for a given attribute, the percentage of a node's edges that connect to nodes that do not share the same value for that attribute (external connections) vs connections to nodes that do share the same attribute value (internal connections). If type is Total the metric will be calculated for all edges, if type is In then the metric will be calculated only for a node's incoming edges and if type is Out then only for a node's outgoing edges.
- Harmonic Centrality: Harmonic Centrality for a node is the sum of the reciprocals of the shortest path distances from that node to each other node in the network. It is closely related to Closeness Centrality with the key difference being that the reciprocal is taken for each distance rather than taking the reciprocal of the sums of the distances. Read more here.
- HITS: Hyperlink-Induced Topic Search (HITS) for a node gives two metrics - Hubs and Authorities. A node has a relatively high Hubs score if it links to other nodes and a relatively high Authority score if it is linked to by other nodes. Read more here.
- Identify Influencers: Identify Influencers is a heuristic that finds the most influential nodes in the network in the sense that together the count of those nodes and the nodes connected to those nodes is maximized, i.e. coverage of the network is maximized. Read more here. It is also possible to limit the influencers identified to certain attribute values by using the Limit by Attribute option. This is helpful if, for example, you want to identify influencers in an organization but only at the individual contributor level.
- In Degree: In Degree for a node is a straightforward measure of centrality - it measures the total number of nodes linking to that node. Read more here.
- K Clique Communities: A K Clique Community is the union of all cliques of size k that can be reached through adjacent k-cliques where a k-clique is a group of k nodes that are all connected to each other and a k-clique is said to be adjacent to another k-clique if it shares k-1 nodes with it. Communities produced by this algorithm are generally not distinct and will overlap so an attribute is added for each community found. Read more here.
- Katz Centrality: Katz Centrality for a node takes into account not just the neighbors of that node but also their neighbors and so on, applying an attenuation factor of alpha so that the influence of nodes declines on every step away from the target node. Read more here.
- Load Centrality: Load Centrality for a node is the total amount of some commodity passing through that node when one unit of the commodity is sent from each node in the network to each other node in the network and the commodity is split equally at branching points and aggregated at meeting points. Load Centrality is very similar to Betweenness Centrality and also measures 'bridging'. Read more here.
- Louvain Communities: Louvain Communities are non-overlapping groups of relatively closely connected nodes found by an optimization algorithm. Read more here.
- Out Degree: Out Degree for a node is a straightforward measure of centrality - it measures the total number of nodes that that node links to. Read more here.
- Pagerank: Pagerank for a node is a ranking of relative importance in the network based on the structure of incoming edges for that node. It was originally designed to rank web pages. Similar to Katz Centrality, alpha is an attenuation factor. Pagerank for undirected networks will be calculated by transforming each undirected edge into two directed edges. Read more here.
- Total Degree: Total Degree for a node is a straightforward measure of centrality. It is simply the total number of edges that that node has, i.e. for directed networks it is the sum of In Degree and Out Degree. Read more here.
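To make the difference between Closeness Centrality and Harmonic Centrality described above concrete, here is a minimal Python sketch. It is not Polinode's implementation: the function names and the three-node example network are hypothetical, distances are unweighted hop counts, and no normalization is applied, so the values Polinode reports may be scaled differently.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def closeness(adjacency, node):
    """Reciprocal of farness, i.e. of the sum of distances to the other reachable nodes."""
    distances = shortest_path_lengths(adjacency, node)
    farness = sum(d for other, d in distances.items() if other != node)
    return 1 / farness if farness else 0.0


def harmonic(adjacency, node):
    """Sum of the reciprocals of the distances to the other reachable nodes."""
    distances = shortest_path_lengths(adjacency, node)
    return sum(1 / d for other, d in distances.items() if other != node)


# Hypothetical undirected network: a - b - c
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

for name in adjacency:
    print(name, closeness(adjacency, name), harmonic(adjacency, name))
```

For node a the distances are 1 and 2, so closeness is 1 / (1 + 2) while harmonic centrality is 1/1 + 1/2. The middle node b scores highest on both measures.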
Some of these metrics are only available for directed networks and some are only available for undirected networks. If a metric is not available for your network you will see a message to that effect to the right of the dialogue.
There is also an option to view Basic Metrics only, which are Louvain Communities, External vs Internal, Identify Influencers, In Degree, Out Degree and Total Degree, all described in the list above. If you are new to network analysis you may want to focus just on these basic metrics to begin with and it's worth noting that you can generally gain a great deal of valuable insight using just these basic metrics.
To calculate a metric simply select it and then input any options (if relevant) and click on "Calculate and Add as Layer". The metric will then be calculated with the result added as an attribute (or attributes) for all nodes. Importantly the metric will only be calculated for the visible network, i.e. after any filters have been applied. This means that you can use Polinode to dynamically calculate metrics for sub-networks. For example, you could filter a network by gender to show only female nodes. If you then calculate In Degree, it will be calculated for the female network only.
When you add a metric it doesn't alter the underlying network data but rather a layer is added so that the metric is recalculated each time a view is opened. For larger networks some metrics can take a while to compute. If this is the case for you, you can simply export your network to Excel after calculating the metric and then reimport that Excel file to update the network. The metric will then be added as an attribute and won't need to be recalculated each time a view is opened.
The second tab in the metrics dialogue provides three advanced edge metrics (two if Basic Metrics is selected):
- Edge Betweenness: Edge betweenness is a measure of the total number of shortest paths in the network that pass through an edge relative to the total number of shortest paths in the network overall. Just as for Node Betweenness, you can select a number of nodes to sample as a percentage of the total nodes in the network. If Apply Edge Weights is set to Yes then the inverse of the edge weights will be used such that a larger edge weight effectively reduces the distance between two nodes rather than increasing it.
- External: For a given attribute, the External metric is True for an edge if that edge connects two nodes with different values of that attribute and False if that edge connects two nodes with the same value for that attribute. For example, if the attribute selected is Gender and an edge connects two males then External will be False for that edge whereas if the edge connects a male and a female then External will be True for that edge.
- Mutual: The Mutual metric is True for an edge if a reciprocal edge also exists, i.e. for an edge from a source node to a target node the metric is true if there is also an edge from the target node to the source node.
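The External and Mutual edge metrics can be expressed in a few lines. The following Python sketch only illustrates the definitions above. The function names, the node names and the data layout (edges as source/target pairs) are assumptions made for the example, not part of Polinode.

```python
def external(edge, node_attribute):
    """True if the edge connects two nodes with different attribute values."""
    source, target = edge
    return node_attribute[source] != node_attribute[target]


def mutual(edge, edges):
    """True if a reciprocal edge (target to source) also exists."""
    source, target = edge
    return (target, source) in edges


# Hypothetical directed network with a Gender attribute on each node
gender = {"ann": "F", "bob": "M", "cat": "F"}
edges = {("ann", "bob"), ("bob", "ann"), ("ann", "cat")}

for edge in sorted(edges):
    print(edge, "External:", external(edge, gender), "Mutual:", mutual(edge, edges))
```

Here the edge from ann to bob is both External (different genders) and Mutual (bob also links to ann), while the edge from ann to cat is neither.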
Clicking on the color option in the left-hand-side window will open a window at the bottom of the page with three options - Nodes, Edges and Labels. It is possible using these tabs to color any of nodes, edges or labels by any attribute. Nodes and labels can be colored by any node attribute and edges can be colored by any edge attribute. To select an attribute for nodes for example, simply select the name of the attribute from the "Color Nodes By" input. If this attribute is a categorical attribute then each value for the attribute will be colored a different color and you can edit these colors by either clicking on the "Edit Colors" button or the names of these attribute values in the legend that will appear. Alternatively, if the attribute you select is a numerical attribute then the maximum value of the attribute will be colored one color and the minimum value of the attribute another color with those two colors interpolated for all values in between. These max and min colors can also be adjusted by clicking on either Edit Colors or the legend itself. Where the attribute is a numerical attribute a Thresholds slider will also appear. With this slider you can specify where you want the breakpoints to be for the max and min values. For example, if your minimum value is 1 and you select 3 for the minimum threshold then the node with the value of 1 will be colored the minimum color, as will any nodes with attribute values that fall between 1 and 3.
Coloring edges and labels works in the same way as described above for coloring nodes except there are some additional options for coloring labels. By default all labels are the same color and the Color Labels Option is set to Custom Color, which you can change in this input. If you would like to color labels by an attribute instead you will need to select "Attribute" for Color Labels Option. There is one other option available for coloring labels and that is to color them by node color - if that option is selected then all labels for nodes will be colored to match the color of the nodes that they relate to.
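The threshold behaviour for numerical attributes can be sketched as follows. This is only an illustration: it assumes a straight linear interpolation of each RGB channel between the minimum and maximum colors, which may not be exactly how Polinode blends colors, and the function name and example colors are hypothetical.

```python
def interpolate_color(value, low, high, min_color, max_color):
    """Map a numerical attribute value to an RGB color.

    Values at or below the low threshold get min_color, values at or above
    the high threshold get max_color, and values in between are blended.
    """
    if high <= low:
        raise ValueError("the high threshold must be greater than the low threshold")
    clamped = min(max(value, low), high)
    t = (clamped - low) / (high - low)  # 0.0 at the low threshold, 1.0 at the high one
    return tuple(round(a + t * (b - a)) for a, b in zip(min_color, max_color))


# Hypothetical colors and thresholds: attribute values run from 1 to 9,
# with the minimum threshold moved up to 3.
MIN_COLOR = (0, 0, 200)
MAX_COLOR = (200, 0, 0)

for value in (1, 3, 6, 9):
    print(value, interpolate_color(value, 3, 9, MIN_COLOR, MAX_COLOR))
```

With the minimum threshold at 3, the values 1 and 3 both receive the minimum color, 6 lands halfway between the two colors and 9 receives the maximum color.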
Once you have colored nodes, edges or labels by any attribute you can add that operation as a layer by clicking the "Add as Layer" button or you can clear it by clicking the "Clear" button.
If you click on a number that sits either above or below a slider thumb you can edit that number in order to set an exact value for it. Simply hit the enter key once you have finished editing the number.
Clicking on the size option in the left-hand-side window will open a window at the bottom of the page with three options - Nodes, Edges and Labels. You can size nodes, edges or labels by any numerical attribute using the inputs here. For example, to size nodes you would click on the "Size Nodes By" input and will be presented with a list of numerical attributes for nodes. Select one of these attributes and nodes will be sized by it. You will likely also want to adjust the maximum and minimum size for nodes directly above this input. And to the right of the selected attribute you will find a Thresholds input which allows you to select the attribute value to apply the minimum size below and the maximum size above.
Once you have sized the nodes, you can add the operation as a layer by clicking the "Add as Layer" button or you can reset sizing by clicking the Clear button. The approach for sizing edges and labels is exactly the same as for sizing nodes.
If you want to apply edge weights when calculating metrics that support using edge weights you will first want to size edges by an attribute using this input and add that operation as a layer.
Filtering works in a similar way to coloring and sizing - simply click on the filter option in the left-hand-side window and a new window will open at the bottom of the page with three tabs - Nodes, Edges and Labels. Let's look at filtering nodes. The first thing that you will want to do is select an attribute to filter the nodes by using the "Filter Nodes By" input. You can filter nodes by any attribute, i.e. both categorical attributes and numerical attributes are supported. The input looks a bit different for the two types of attributes though.
If you select a numerical attribute to filter by you will see a Thresholds slider appear and next to that slider you will see a Filter Type input. By default the filter type will be set to Standard. A standard filter works in exactly the way that you would expect it to - all nodes with an attribute value less than the minimum threshold you select will be hidden and all nodes with an attribute value greater than the maximum threshold you select will be hidden.
In most cases the default Standard filter type is what you will want to use. But sometimes you may require more flexibility and that is where the Inverse filter type comes in handy. Just as its name suggests it works in exactly the inverse way to a standard filter - nodes with attribute values between the selected minimum and maximum thresholds will be actively unhidden. So you can, for example, use a Standard filter to hide all nodes in the network and then add an Inverse filter which unhides specific nodes.
Filters can be stacked on top of each other. So, for example, you can add a filter so that only females are shown and then you can add another filter to show only employees with a tenure of more than five years. Standard filters stacked together like that work like an AND operator. Combining Standard filters with Inverse filters allows you to construct complex AND and OR operations together.
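One way to picture how stacked Standard and Inverse filters combine is the Python sketch below. It models only numerical filters, applied in layer order. The layer format (a tuple of filter type, attribute, minimum and maximum) and the function name are hypothetical and are not Polinode's data model or API.

```python
def apply_filters(nodes, layers):
    """Return the set of visible node ids after applying filter layers in order.

    nodes:  dict of node id -> dict of numerical attributes
    layers: list of (kind, attribute, low, high), kind is "standard" or "inverse"
    """
    visible = set(nodes)
    for kind, attribute, low, high in layers:
        for node_id, attributes in nodes.items():
            in_range = low <= attributes[attribute] <= high
            if kind == "standard" and not in_range:
                visible.discard(node_id)  # Standard: hide nodes outside the thresholds
            elif kind == "inverse" and in_range:
                visible.add(node_id)      # Inverse: unhide nodes inside the thresholds
    return visible


# Hypothetical employee network
nodes = {
    "ann": {"tenure": 10, "age": 40},
    "bob": {"tenure": 2, "age": 23},
    "cat": {"tenure": 1, "age": 50},
}

# Two Standard filters behave like AND: tenure of 5 to 40 AND age of 30 to 45
both = apply_filters(nodes, [("standard", "tenure", 5, 40), ("standard", "age", 30, 45)])

# A Standard filter followed by an Inverse filter behaves like OR:
# tenure of 5 to 40 OR age of 20 to 25
either = apply_filters(nodes, [("standard", "tenure", 5, 40), ("inverse", "age", 20, 25)])

print(sorted(both), sorted(either))
```

In the second case bob is hidden by the tenure filter and then unhidden by the Inverse age filter, which is what makes the combination act like an OR.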
If you select a categorical attribute rather than a numerical attribute, the logic above still applies but the input appears slightly differently. Rather than a slider you will instead see a list of the attribute's values under "Selected Attribute Values". By default all of the attribute values will be selected. You can change the attribute values that are selected by either left clicking within the Selected Attribute Values (holding down the control key to select multiple attribute values) or by searching and selecting attribute values using the input to the left of it. You can also quickly deselect all attribute values by clicking the "Deselect All" button and you can flip what is selected by clicking the "Toggle All" button in which case all selected values will be unselected and all unselected values will be selected.
Once you have set up your node filter, you can add it as a layer by clicking on the "Add as Layer" button or you can reset it by clicking "Clear". Edge and Label filters work in exactly the same way as node filters described above.
When you calculate metrics in Polinode, they are calculated on the visible nodes and edges only. This means that you can apply a filter or filters prior to calculating a metric and that metric will be calculated for the sub-network only.
You can roll-up nodes by any categorical attribute. Simply select the categorical attribute that you would like to roll-up nodes by and all the nodes that share the same attribute value for that categorical attribute will be combined together. Nodes in the network will then represent the attribute values and the edges between those rolled-up nodes will represent the aggregated connections between nodes that share the same attribute values.
You will also generally see self-loops after rolling up nodes since nodes will generally be connected to other nodes that share the same attribute values. You can hide these self-loops by clicking on Settings then Edges and setting self-loops to No.
When you roll nodes up, new attributes are calculated for both nodes and edges. The new attributes that are calculated represent the Average and Total values across all numerical attributes for all nodes that share the same attribute values for the rolled-up values. For example, if you roll nodes up by Division and you have previously calculated In Degree then both Average In Degree and Total In Degree will be calculated for each division.
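The roll-up aggregation can be sketched in Python as follows. This is an illustration under stated assumptions rather than Polinode's implementation: the function name and data layout are hypothetical, only node attributes are aggregated, and the aggregated connection between two groups is represented simply as a count of the underlying edges.

```python
from collections import defaultdict


def roll_up(nodes, edges, group_by, numeric_attributes):
    """Combine nodes that share a value of group_by and aggregate their data."""
    members = defaultdict(list)
    for attributes in nodes.values():
        members[attributes[group_by]].append(attributes)

    rolled_nodes = {}
    for group, rows in members.items():
        summary = {}
        for name in numeric_attributes:
            total = sum(row[name] for row in rows)
            summary[f"Total {name}"] = total
            summary[f"Average {name}"] = round(total / len(rows), 2)
        rolled_nodes[group] = summary

    # Edges within one group become a self-loop on the rolled-up node
    rolled_edges = defaultdict(int)
    for source, target in edges:
        rolled_edges[(nodes[source][group_by], nodes[target][group_by])] += 1
    return rolled_nodes, dict(rolled_edges)


# Hypothetical network with In Degree already calculated
nodes = {
    "ann": {"Division": "Sales", "In Degree": 1},
    "bob": {"Division": "Sales", "In Degree": 1},
    "cat": {"Division": "Ops", "In Degree": 2},
}
edges = [("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("cat", "ann")]

rolled_nodes, rolled_edges = roll_up(nodes, edges, "Division", ["In Degree"])
print(rolled_nodes)
print(rolled_edges)
```

Rolling up by Division gives the Sales node a Total In Degree of 2 and an Average In Degree of 1.0, and the edge from ann to bob becomes a Sales self-loop.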
The position and color of rolled up nodes will be set to the average position and average color of the nodes that share the same attribute value.
Rolling up nodes can be preserved as a layer by clicking on the "Add as Layer" button and the roll-up can be cleared by either deleting the layer or clicking the Clear button.
If you export a rolled-up network to Excel you can then re-import that network as a new network. This gives you more flexibility with respect to the rolled-up network, including giving you the ability to save node positions for the rolled-up network.
Clicking on the Reports option in the right-hand-side menu will open up the Network Reports window with three tabs - "Network Summary", "Nodes Table" and "Matrix Reports". The first tab - "Network Summary" - contains a high-level summary of the network including: the name of the network, whether it is directed or undirected, the number of visible nodes and edges, the average total degree of all visible nodes in the network and the density of the visible network. Network density is the number of relationships that do in fact exist as a percentage of the number of potential relationships that could exist in the network.
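As a worked example of the density definition above, the following Python sketch computes density from node and edge counts. It assumes the usual convention that self-loops and multiple edges between the same pair of nodes are not counted as potential relationships. How Polinode treats those cases is not stated here, and the function name is hypothetical.

```python
def density(node_count, edge_count, directed):
    """Existing relationships as a fraction of potential relationships."""
    if node_count < 2:
        return 0.0
    potential = node_count * (node_count - 1)  # ordered pairs of distinct nodes
    if not directed:
        potential //= 2  # an undirected edge covers both directions
    return edge_count / potential


# Four visible nodes connected by three edges
print(density(4, 3, directed=False))
print(density(4, 3, directed=True))
```

Four nodes allow 6 undirected or 12 directed relationships, so three edges give a density of 50% for an undirected network and 25% for a directed one.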
The second tab is Nodes Table, which provides an Excel-like table for examining nodes and their attribute values. You can select up to five different node attributes to show side-by-side. To change the attributes that are displayed, click the "Choose Columns" button and then select the attributes you would like to see. When you are ready, click the Fix Columns button. You can then sort the attributes in either ascending or descending order by clicking on the column heading for that attribute. The sorted table can be exported to Excel by clicking on the "Export Nodes" button towards the bottom of the window.
The final tab within the Network Reports window is the Matrix Reports tab. There are two matrix reports available - the Collaboration Matrix and the Density Matrix. The Collaboration Matrix shows the interactions between different groups within the network. You will need to select a categorical attribute such as Division and will then see a matrix summarizing the interactions between nodes with the different values of this attribute. The way to read this matrix is that the row headings relate to the source of connections and the column headings to the target (or, put alternatively, the from and to). So, each row adds up to 100% and shows how the relationships that originate from the row heading attribute value are split in terms of their destination amongst the different values for the selected attribute. The Collaboration Matrix also includes, as the final row in the matrix, Relative Size. This row also adds up to 100% and shows what percentage of the total visible nodes have each attribute value. You can easily export the collaboration matrix to Excel by clicking the "Export Matrix" button at the bottom of the window.
Once the Collaboration Matrix is in Excel, you can create a heatmap by using conditional formatting in Excel.
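To show how the Collaboration Matrix percentages arise, here is a small Python sketch that builds the row percentages and the Relative Size row from a list of edges. The function name, node names and data layout are hypothetical, and every edge is counted equally (no edge weights), which is an assumption made for the example.

```python
from collections import Counter


def collaboration_matrix(node_groups, edges):
    """Row-normalized percentages of edges from each source group to each target group."""
    counts = Counter((node_groups[s], node_groups[t]) for s, t in edges)
    groups = sorted(set(node_groups.values()))

    matrix = {}
    for row in groups:
        row_total = sum(counts[(row, col)] for col in groups)
        matrix[row] = {
            col: 100 * counts[(row, col)] / row_total if row_total else 0.0
            for col in groups
        }

    # Relative Size: share of nodes that have each attribute value
    sizes = Counter(node_groups.values())
    relative_size = {g: 100 * sizes[g] / len(node_groups) for g in groups}
    return matrix, relative_size


# Hypothetical network grouped by Division
node_groups = {"ann": "Sales", "bob": "Sales", "cat": "Ops", "dan": "Ops"}
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("bob", "ann"), ("cat", "ann")]

matrix, relative_size = collaboration_matrix(node_groups, edges)
print(matrix)
print(relative_size)
```

Of the four edges that originate in Sales, two stay within Sales and two go to Ops, so the Sales row reads 50% and 50%. The single edge originating in Ops goes to Sales, so the Ops row reads 100% for Sales.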
Within the Matrix Reports tab there is also a second option - Density Matrix. This option is similar to the Collaboration Matrix but each cell in the matrix contains the density of connections for that particular combination of attribute values, i.e. the percentage of connections that do in fact exist relative to the total connections that could potentially exist for that combination of attribute values.
Both Collaboration Matrices and Density Matrices have some inputs that you can set to the right of the attribute selection input. These inputs allow you to filter the displayed cells so as to only show values where they are above or below thresholds that you set. You can also determine the number of decimals to display for each cell.
All three tabs in the Reports window are fully dynamic in the sense that if you apply a filter or set of filters, they will be recomputed for the visible network only.
There are five options for exporting networks within the explore view:
- PNG: This option will allow you to export the network as a PNG image and will include the legend and background - essentially everything that you see.
- SVG: The SVG export is similar to the PNG export in the sense that an image file is the result. However, the SVG file gives you arbitrarily high resolution. The tradeoff though is that some features are not supported, for example, your legend will not be included in an SVG export.
- Excel: This option gives you the ability to export your network to Excel in exactly the same Excel format as you can import networks into Polinode. It will include all node and edge attributes, including any metrics that you have calculated.
- GEXF: Graph Exchange XML Format (or GEXF for short) is an XML-type format that is designed specifically for network data. You can export your Polinode networks to this format for import into other specialized network packages. This export includes not only the node and edge attributes but also the position and color of nodes / edges.
- GraphML: GraphML is similar to GEXF - it's an alternative specialized network export format that you can use to import your Polinode network into other packages if you choose to do so.
You can export a network image with a transparent background color by first navigating to Settings and then to the Other tab and setting the background color alpha (the fourth number) to zero. Then, if you export to PNG, the image will have a transparent background and you can overlay it over another background in say Powerpoint or even an image of a map.
Export to Excel, GEXF and GraphML is only available for Owners of networks and for users with Edit permission for a network. Users with View permission are not able to export to these formats.
Clicking on Settings within the left-hand-side menu will display the Settings window at the bottom of the page. There are four tabs here - Edges, Legend, Selection and Other. For the most part these settings control the visual appearance of the network.
The first tab - Edges - allows you to control the visual appearance and interaction with edges in the network. There are six settings related to edges altogether:
- Edge Style: Here you can select between curved edges and straight edges. Importantly, for directed networks, straight edges will overlap each other if there is an edge from both the source to the target and from the target to the source. Sometimes though you may want the more simplified view that straight edges afford.
- Edge Hover: To keep things simple, when you hover over an edge in Polinode you don't see the attributes associated with that edge in the right-hand-side window in the same way that you do when you hover over nodes. However, it's easy to turn edge hover on by changing the setting here to Yes.
- Self-Loops: By default a node's connection to itself is shown as a circular loop. Quite often it's preferable to hide these self-loops and you can do so by changing this setting to No.
- Draw Edges: If you would like to hide edges from the network diagram you can do so by changing this setting to No.
- Arrows: By default, edges in directed networks have arrows at the target end of the edge. You can turn these arrows off by selecting No here.
- Arrow Size: If the network is a directed network and arrows are set to show then you can increase or decrease the size of the arrows relative to the thickness of the edges using this slider.
The second tab within Settings is Legend and it is here that you can control the visual appearance of the Legend. You can select the color of the text in the legend as well as hide any of the three legends - nodes, edges or labels. Each of those three legends will appear by default as soon as you color any of nodes, edges or labels.
The third tab is Selection and in this tab you will find settings that control the visual appearance related to your interaction with the network. The first three settings relate to the color of borders, backgrounds and label text when you hover over a node with the mouse. The fourth setting is the color of an edge when you hover over it. The fifth setting allows you to determine how transparent unselected nodes and edges are after you make a selection (e.g. by left-clicking on a node). The final setting determines which edges are shown after you make a selection. By default, all the edges between selected nodes are shown. So, if you select a node and that node is connected to, say, three other nodes then the edges that will be selected are the edges going into the selected node, the edges going out from it and the edges between all the nodes that are connected to it. That is what the "All Edges" selection gives you. However, you can limit the edges that are shown to be only the incoming edges, only the outgoing edges or only the incoming edges and outgoing edges (i.e. excluding the edges between the neighbor nodes) by changing this setting.
The final tab is Other and contains a few other helpful settings. The first is the background color of the whole network diagram. Then there is a button next to that input that, if clicked, will show a summary of all available keyboard shortcuts within the explore view. There is a second button to the right of it that allows you to rename Communities in the network from "Community 1", "Community 2", etc. to names that you give them. And then there is a third button that will reset the entire view, including removing any layers.
The final button in the Other tab is Advanced Options and, if you click this button, you will see a number of advanced options that are typically not required but are there should you need them:
- Hide Edges on Move: For large networks this will be Yes by default and edges will be hidden when you move the network around and then displayed when you stop moving the network around.
- Delay Sliders: Again, for large networks this will automatically be set to Yes. Basically, it stops the network re-rendering until after you stop moving a slider such as the inputs in sizing nodes.
- Multiple Edges: When inputting network data in Polinode you can input multiple edges between the same source and target pair of nodes and these edges may have different values for the same attribute. By default though these edges will overlap each other and not be rendered separately. This setting allows you to show these edges separately as each edge will have a different curvature.
- Floor and Ceiling Settings: When sizing nodes, edges and labels you may find that you need to increase say the maximum value that the sliders go up to. You can do so here.
- Advanced Force Directed Settings: Speed, Repulsion, Gravity and Mode are all advanced settings for the force directed layout algorithm.
- Zoom Ratio: When using the scroll wheel button on your mouse you will zoom in and out a certain amount. You can increase or decrease the sensitivity of zoom using this setting.
- Rollup Multiple: When applying roll-up the nodes will automatically increase in size after roll-up. You can control how much they increase in size with this setting.
- Rollup Decimals: When calculating the metrics post roll-up (i.e. the Average and Total metrics for numerical attributes) they are calculated to two decimal places by default but that can be changed here.
- Label Threshold: This setting allows you to hide labels for smaller nodes with those labels only being shown after zooming in a certain amount (which is set by this slider).
The final button in the left-hand-side menu is the expand and collapse menu button. By clicking the expand button you will see a larger version of the left-hand-side menu that includes text descriptions of each item in the menu. You can then collapse that menu back down to the smaller version by clicking on the Collapse option.
Using the Legend
Whenever you color nodes, edges or labels a legend will appear or be updated in the top left of the screen. This legend will automatically update to reflect any filters that you have applied or do apply. Clicking on the labels in this legend gives you a quick way of changing the colors of the nodes in the network. This is an alternative to clicking on the "Edit Colors" button under Color. The advantage of the "Edit Colors" button though is that you can input an exact rgba value for each color you want to have in the network.
There are a large number of keyboard shortcuts available to you in the explore view. Some of the more advanced actions that you can do with keyboard shortcuts are not available via the menus so it's generally a good idea to familiarise yourself with the shortcuts summarised below:
- Left arrow: Navigate left
- Right arrow: Navigate right
- Down arrow: Navigate down
- Up arrow: Navigate up
- Down arrow + spacebar: Zoom out
- Up arrow + spacebar: Zoom in
- Home: Rotate right
- End: Rotate left
- Left click node: Select node and all connected nodes
- Control + left click node: Add node and all connected nodes to selection
- Right click node: Select node
- Control + right click node: Add / remove node from selection
- Left click edge: Select edge and connected nodes
- Control + left click edge: Add / remove edge and nodes from selection
- Right click outside nodes: Clear all selections
- ?: Show / hide the keyboard shortcut menu
- 1: Add all neighbors to selection
- a: Select all visible nodes
- c: Clear all selected nodes
- e: Export network
- f: Toggle fullscreen
- g: Start / stop layout algorithm
- h: Hide non-selected nodes and edges
- i: Invert selected nodes
- l: Toggle lasso tool
- m: Select marked nodes
- n: Show / hide names
- o: Mark visible nodes as no
- s: Scale network to fit
- u: Unhide
- x: Fix position of selected nodes
- y: Mark visible nodes as yes
- z: Unfix position of selected nodes
- /: Search or unfocus search
- Delete: Unselect in search
Editing a Network
Once you have uploaded a network, you don't need to create an entirely new network in order to edit it. Rather, you will find that there is an Edit button next to each network in the list of your networks. Simply click on that button and you can update the underlying network data by selecting a file with your updated network data in it. You can also edit the network name here or change a network from Public to Private or vice versa. Under Advanced Options you can also edit whether the network is directed or not. It's worth noting that you don't need to upload a file each time you edit a network. So, if you just want to change the name of the network, for example, you can click Edit and change only the name before clicking Update.
Managing a Network
Next to each network in your list of networks you will see a Manage button. Clicking on that button will bring you to a page where you can edit the settings for the network and also control user access to each network.
There are two settings that you can edit for each network:
- Description: This is the description that users will see in the right-hand-side menu of the explore view when they first open the network. Markdown is supported for this description so you can add images, tables, bold, italics, links, etc.
- Image for Discover: If the network is public then you will see a dialogue where you can upload an image for the network. This image will be displayed under Discover, i.e. when a user is scrolling through public networks they will see this image.
This description and image will also be used when you share the network on social media. For example, Twitter will show a card with the image and the first few lines of your description if you share the link to your public network on Twitter.
By default, only the Owner of a network has access to that network if the network is private. The Owner of a network is the user who first created that network. However, the Owner of a network can grant permission to other users to access that network. You do this by clicking on the Add User button. A dialogue will appear where you enter the email address of the user that you want to add. Then click Check. Polinode will check whether that user already has a Polinode account and let you know whether they do or not. After you have entered the email of the user, you will want to edit the permissions that you would like to grant them. You can grant a user either Edit or View permissions. With Edit permissions they can change the underlying data, save new views, etc. With View permissions they can view the network but cannot edit it.
The other thing that you will need to decide is whether the user you are adding will have permission to grant other users access to the network themselves, and whether or not they will have the ability to remove the access of other users (or edit their permissions).
Once you have selected the appropriate permissions, click the Add button and the user will receive an email informing them that they have been granted access to the network. If the user doesn't already have a Polinode account they will receive an invitation to create a Polinode account at the same time.
The Users tab for a network will only be visible to users that have the ability to add or edit users for the network. If a user does not have that ability then, instead of seeing the Users tab, they will see a Permissions tab that lets them know what their own permissions are.
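The permission options described above can be summarised in a short Python sketch. This is only a way of picturing the rules, not Polinode's internal data model or API: the class names, field names and method names below are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    """The two access levels that can be granted to a user."""
    VIEW = "view"
    EDIT = "edit"


@dataclass(frozen=True)
class NetworkPermission:
    """Hypothetical record of one user's permissions on one network."""
    email: str
    access: Access
    can_add_users: bool = False      # may grant other users access
    can_manage_users: bool = False   # may remove users or edit their permissions

    def can_edit_data(self) -> bool:
        # Only Edit permissions allow changing data, saving new views, etc.
        return self.access is Access.EDIT

    def visible_tab(self) -> str:
        # The Users tab is shown only to users who can add or edit users;
        # everyone else sees a Permissions tab with their own permissions.
        if self.can_add_users or self.can_manage_users:
            return "Users"
        return "Permissions"


viewer = NetworkPermission("viewer@example.com", Access.VIEW)
editor = NetworkPermission("editor@example.com", Access.EDIT, can_add_users=True)

print(viewer.can_edit_data(), viewer.visible_tab())
print(editor.can_edit_data(), editor.visible_tab())
```

Note that the access level (Edit or View) and the two user-management abilities are independent choices, which is why they are modelled as separate fields here.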
It's also possible to add and manage users for a particular network in bulk by using the Upload File button.
Deleting a Network
Deleting a network is as simple as navigating to your list of networks and then clicking the delete button next to the network that you would like to delete. On confirmation the network will be permanently removed.
Discovering Public Networks
In the menu bar at the top of the page you will see a Discover menu item. Clicking that item will take you to a scrollable list of all publicly available networks in Polinode. To begin with you will see only those networks that have been handpicked as being particularly interesting or noteworthy. If you would like to see all public networks, change this option from handpicked to recent and you will then see all public networks in chronological order.
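The difference between the handpicked and recent options can be pictured with the following Python sketch. It is an illustration under stated assumptions rather than a description of how Polinode works internally: the `PublicNetwork` class, its fields and the `discover` function are hypothetical, and the sketch assumes that "recent" lists the newest networks first.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PublicNetwork:
    """Hypothetical record for a public network shown under Discover."""
    name: str
    created: date
    handpicked: bool = False


def discover(networks, mode="handpicked"):
    """Return the networks to list for the chosen Discover option."""
    if mode == "handpicked":
        # Only the networks flagged as particularly interesting.
        return [n for n in networks if n.handpicked]
    if mode == "recent":
        # All public networks, assumed here to be ordered newest first.
        return sorted(networks, key=lambda n: n.created, reverse=True)
    raise ValueError(f"unknown mode: {mode}")


# Made-up sample data.
catalogue = [
    PublicNetwork("Network A", date(2020, 1, 15), handpicked=True),
    PublicNetwork("Network B", date(2021, 6, 1)),
    PublicNetwork("Network C", date(2019, 3, 9)),
]

print([n.name for n in discover(catalogue)])
print([n.name for n in discover(catalogue, mode="recent")])
```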