# Time-course analysis web-app

## Running instance
Access the running instance of the app on [shinyapps.io](https://macdobry.shinyapps.io/tcourse-inspector/ "TimeCourse Inspector").

## Running the app on the server
The app can be deployed on an RStudio/Shiny server. Follow the instructions [here](https://shiny.rstudio.com/deploy/ "Shiny - Hosting").

## Running the app locally
Alternatively, after downloading the code, the app can be run from within RStudio. Open the `server.R` or `ui.R` file, then click the "Run App" button (green triangle) in the upper-right corner of the window that shows the code.

The following packages need to be installed to run the app locally:

* shiny
* shinyjs
* shinyBS
* shinycssloaders
* data.table
* DT
* ggplot2
* gplots
* plotly
* scales
* grid (bundled with R, no separate installation needed)
* dendextend
* RColorBrewer
* ggthemes
* sparcl
* dtw
* factoextra
* imputeTS
* MASS
* robust
* pracma
* Hmisc

Install a single package by running `install.packages('name_of_the_package_from_the_list_above')` in the RStudio console, or install all of them at once:

```
# "grid" is a base package that ships with R, so it is not installed from CRAN.
install.packages(c("shiny", "shinyjs", "shinyBS", "shinycssloaders",
                   "data.table", "DT",
                   "ggplot2", "gplots", "plotly", "scales",
                   "dendextend", "RColorBrewer", "ggthemes",
                   "sparcl", "dtw", "factoextra",
                   "imputeTS", "MASS", "robust", "pracma", "Hmisc"))
```
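
To avoid reinstalling packages that are already present, one option is to check first which ones are missing. The sketch below is not part of the app; `find_missing_pkgs` is a hypothetical helper name, and the actual installation call is left commented out.

```r
# Packages the app needs from CRAN. "grid" is left out because it ships with R.
required_pkgs <- c("shiny", "shinyjs", "shinyBS", "shinycssloaders",
                   "data.table", "DT",
                   "ggplot2", "gplots", "plotly", "scales",
                   "dendextend", "RColorBrewer", "ggthemes",
                   "sparcl", "dtw", "factoextra",
                   "imputeTS", "MASS", "robust", "pracma", "Hmisc")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
find_missing_pkgs <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

missing_pkgs <- find_missing_pkgs(required_pkgs)

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install only the missing ones:
  # install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}
```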

## Input file

The app recognizes CSV (comma-separated values) files in which data columns are separated by a comma and floating point numbers use a dot (full stop) as the decimal mark. Compressed CSV files in zip or bz2 format can be uploaded directly without decompression. Both long and wide data formats are accepted, but we highly recommend the long format because it allows for multiple groupings and multivariate measurements.
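
To check that a file matches this description before uploading it, the file can be read in plain R. The following is a minimal sketch using only base R; it writes a small made-up bz2-compressed CSV to a temporary file and reads it back. It illustrates the file format only and is not how the app itself necessarily reads its input.

```r
# Write a tiny example CSV, compressed with bzip2, to a temporary file.
csv_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(csv_path, open = "w")
writeLines(c("ID,Time,Meas1",
             "1,1,3.3",
             "1,2,2.1"), con)
close(con)

# Read it back through a bzfile() connection. `sep` and `dec` match the
# format described above (comma separator, dot as decimal mark).
dat <- read.csv(bzfile(csv_path), sep = ",", dec = ".")
print(dat)
```

A zip archive can be read in a similar way through `unz()`, which additionally needs the name of the CSV file inside the archive.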

#### Long format
In the long format, the first row should include column headers. The input CSV file should contain at least these three columns:

* Identifier of a time series, i.e. a track label
* Time points
* Time-varying variable

| ID | Time | Meas1 |
|----|------|-------|
| 1  |  1   | 3.3   |
| 1  |  2   | 2.1   |
| 1  |  4   | 4.3   |
|----|------|-------|
| 2  |  1   | 2.8   |
| 2  |  2   | 1.9   |
| 2  |  3   | 1.7   |
| 2  |  4   | 2.2   |
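
As an illustration, the table above could be produced in R as follows. This is a sketch with base R only; the output file name is an arbitrary choice.

```r
# Long-format table from the example above: one row per track and time point.
long_dat <- data.frame(
  ID    = c(1, 1, 1, 2, 2, 2, 2),
  Time  = c(1, 2, 4, 1, 2, 3, 4),
  Meas1 = c(3.3, 2.1, 4.3, 2.8, 1.9, 1.7, 2.2)
)

# Write with a header row, comma separator and dot as decimal mark.
out_path <- file.path(tempdir(), "example_long.csv")  # hypothetical file name
write.csv(long_dat, out_path, row.names = FALSE)

# Number of time points per track; track 1 has no measurement at Time = 3.
points_per_track <- table(long_dat$ID)
print(points_per_track)
```

Note that tracks do not need to have the same number of time points in the long format.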

For multivariate time series, additional measurement columns can be added to the input. The GUI then allows choosing a single variable or a combination of two variables to display.

Time series can be grouped by introducing a grouping column:

| Group | ID | Time | Meas1 |
|-------|----|------|-------|
| gr1   | 1  |  1   | 3.3   |
| gr1   | 1  |  2   | 2.1   |
| gr1   | 1  |  4   | 4.3   |
|-------|----|------|-------|
| gr1   | 2  |  1   | 2.8   |
| gr1   | 2  |  2   | 1.9   |
| gr1   | 2  |  3   | 1.7   |
| gr1   | 2  |  4   | 2.2   |
|-------|----|------|-------|
| gr2   | 1  |  1   | 5.1   |
| gr2   | 1  |  2   | 5.4   |
| gr2   | 1  |  3   | 5.3   |

Introducing a grouping column allows the data to be analysed and displayed per group.
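
To show what a per-group analysis means for the grouped table above, the sketch below computes the mean of `Meas1` per group and time point with base R. It only illustrates the idea; the app's own computation may differ.

```r
# Grouped long-format data from the example above.
grouped_dat <- data.frame(
  Group = c(rep("gr1", 7), rep("gr2", 3)),
  ID    = c(1, 1, 1, 2, 2, 2, 2, 1, 1, 1),
  Time  = c(1, 2, 4, 1, 2, 3, 4, 1, 2, 3),
  Meas1 = c(3.3, 2.1, 4.3, 2.8, 1.9, 1.7, 2.2, 5.1, 5.4, 5.3)
)

# Mean of Meas1 for every combination of group and time point.
group_means <- aggregate(Meas1 ~ Group + Time, data = grouped_dat, FUN = mean)
group_means <- group_means[order(group_means$Group, group_means$Time), ]
print(group_means)
```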

#### Wide format
In wide format, each univariate time series is stored as a row, with columns treated as time points. The first two columns should contain the grouping and the time series identifier.

| Group | ID | 0   | 1   | 2   | further time points |
|-------|----|-----|-----|-----|---------------------|
| gr1   | 1  | 3.0 | 3.3 | 3.1 | ...                 |
| gr1   | 2  | 2.0 | 2.1 | 1.9 | ...                 |
| gr2   | 1  | 4.9 | 5.1 | 5.0 | ...                 |
| gr2   | 2  | 5.2 | 5.4 | 5.3 | ...                 |
| gr2   | 3  | 5.5 | 5.3 | 5.6 | ...                 |

We do not recommend this format because of its lack of flexibility. In wide format, only one grouping column and one measurement can be passed at a time, so every new grouping or measurement requires a dedicated file.
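
Existing wide-format data can be converted to the long format before upload. The sketch below uses base R only; `wide_to_long` is a hypothetical helper, and the column names `Group`, `ID`, `Time` and `Meas1` are taken from the examples in this README. Packages such as `data.table` provide a `melt` function for the same purpose.

```r
# Wide-format example from the table above (first three time points only).
wide <- data.frame(
  Group = c("gr1", "gr1", "gr2", "gr2", "gr2"),
  ID    = c(1, 2, 1, 2, 3),
  "0"   = c(3.0, 2.0, 4.9, 5.2, 5.5),
  "1"   = c(3.3, 2.1, 5.1, 5.4, 5.3),
  "2"   = c(3.1, 1.9, 5.0, 5.3, 5.6),
  check.names = FALSE  # keep "0", "1", "2" as column names
)

# Hypothetical helper: stack the time-point columns into Time/Meas1 pairs.
wide_to_long <- function(wide) {
  id_cols   <- c("Group", "ID")
  time_cols <- setdiff(names(wide), id_cols)
  pieces <- lapply(time_cols, function(tc) {
    data.frame(wide[, id_cols, drop = FALSE],
               Time  = as.numeric(tc),
               Meas1 = wide[[tc]])
  })
  long <- do.call(rbind, pieces)
  long <- long[order(long$Group, long$ID, long$Time), ]
  rownames(long) <- NULL
  long
}

long <- wide_to_long(wide)
print(head(long))
```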

## Unique track IDs

For the analysis, track labels need to be unique across the entire dataset. If the track label column is not unique in the uploaded dataset, the UI offers an option to create a unique track ID. Tick the *Create unique track label* box and choose the grouping columns that will be added to the existing non-unique track label.

In the grouped example above, the `ID` column is not unique across the dataset (ID=1 appears in both groups, `gr1` and `gr2`), so the unique track label has to consist of the columns `Group` and `ID`. The resulting track labels will be `gr1_1`, `gr2_1`, etc.
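
The same labels can also be created before upload. A minimal sketch in base R, assuming the label is the grouping column and the ID joined by an underscore, as in the example labels above; the column name `UniqueID` is an arbitrary choice:

```r
# One row per track from the grouped example.
tracks <- data.frame(
  Group = c("gr1", "gr1", "gr2"),
  ID    = c(1, 2, 1)
)

# The bare ID repeats across groups, so it cannot identify a track.
id_is_duplicated <- anyDuplicated(tracks$ID) > 0

# Prefix the ID with the grouping column to obtain a unique label.
tracks$UniqueID <- paste(tracks$Group, tracks$ID, sep = "_")
print(tracks)
```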


## Modules

The app opens with a default window for plotting population averages, individual time series, and the power spectral density.

The following features of time series analysis are available in the app:

- Perform simple **math calculations** on an individual variable (inversion 1/X), or on two variables (division, sum, multiplication, subtraction).
- **Trim** the time axis of the data.
- **Normalise** to the average of data points in a selected interval. Time series can be normalised with respect to the entire dataset, a group, or a single time series. The last option normalises every time course to the mean of its own selected interval.
- **Remove outlier time points** by removing a percentage of data from the top, bottom, or both tails of pooled data points. Gaps in time series due to outlier removal can then be linearly interpolated, or the affected tracks can be removed entirely from the set. The UI allows for selecting the size of gaps above which the track is removed.
- **Highlight** individual time series by selecting a unique series identifier.
- Calculate the area under individual time series and visualise it as a dot-, violin-, or box-plot. The UI allows for selection of the time series range used for the **AUC** calculation.
- Display a dot-, violin-, box-, or a line-plot for selected time points.
- Display a scatter-plot to identify **correlations** between two time points.
- Calculate the **power spectral density (PSD)** using a smoothed periodogram or an autoregressive fit. Both estimates rely on R's built-in `spectrum` function. PSD plots can be visualized in the frequency or period domain, independently for each group of time series. Axes can be transformed with common functions (log, inverse, ...) to facilitate the identification of spectral patterns.
- Perform **hierarchical and sparse-hierarchical clustering** of individual time series. In these modules, the dendrogram can be cut at a chosen level to help visualise clusters. Additionally available are plots with cluster averages, individual time series per cluster, and the contribution of time series from different groupings to clusters.
- Perform **cluster validation**. In this module, both relative and internal validations are available. Relative validation sweeps through a range of possible cluster numbers and reports the average silhouette width and the within-cluster sum of squares. Internal validation, for a fixed number of clusters, returns three visualizations: a dendrogram colored according to the cut, the silhouette plot, and a visualization of the clusters on the first two principal components. This analysis relies on the implementation in the R package `factoextra`.
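
To give a feeling for what some of these modules compute, the sketch below walks through three of the steps on made-up data using base R only: normalisation of each track to its own baseline, the area under the curve by the trapezoidal rule, and hierarchical clustering with a cut into two clusters. The track values, the baseline interval, the helper `trapz_auc`, and the choice of Euclidean distance with complete linkage are all assumptions made for illustration; the app's own implementation and defaults may differ.

```r
# Three short, regularly sampled example tracks (made-up values).
time_pts <- 0:5
tracks <- rbind(
  a = c(1.0, 1.1, 2.0, 3.0, 2.0, 1.0),
  b = c(2.0, 2.0, 4.1, 6.0, 4.0, 2.1),
  c = c(1.0, 0.9, 1.0, 1.1, 1.0, 1.0)
)

# Normalise every track to the mean of its own first two time points.
baseline    <- rowMeans(tracks[, time_pts <= 1])
norm_tracks <- tracks / baseline

# Area under each normalised track by the trapezoidal rule.
trapz_auc <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
auc <- apply(norm_tracks, 1, trapz_auc, x = time_pts)

# Hierarchical clustering on Euclidean distances, cut into two clusters.
hc       <- hclust(dist(norm_tracks), method = "complete")
clusters <- cutree(hc, k = 2)

print(round(auc, 2))
print(clusters)
```

After normalisation, tracks `a` and `b` have nearly the same shape and fall into one cluster, while the flat track `c` ends up in the other.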