Active and Dynamic Approaches for Clustering Time Dependent Information: Lag Target Time Series Clustering and Multi-Factor Time Series Clustering

Doo Young Kim; Chris P. Tsokos

doi:10.2991/jsta.2018.17.3.5

Volume 17, Issue 3, September 2018, Pages 462 - 477

Authors

Doo Young Kim (dkim@shsu.edu)

Department of Mathematics and Statistics, Sam Houston State University Box 2206, Huntsville, TX 77341-2206, USA

Chris P. Tsokos (ctsokos@usf.edu)

Department of Mathematics and Statistics, University of South Florida 4202 East Fowler ave, CMC 342, Tampa, FL 33620, USA

Received 22 August 2017, Accepted 17 October 2017, Available Online 30 September 2018.

DOI: 10.2991/jsta.2018.17.3.5
Keywords: Time Dependent Information; Clustering; Mahalanobis Distance
Abstract: One of the data mining schemes in statistics is clustering panel data such as longitudinal data and time series data. Classical approaches to clustering such time dependent information do not properly account for time dependencies among the objects we are interested in analyzing. In the present study, we propose an approach that takes time dependencies into consideration by introducing appropriate weight factors, together with an add-on approach that allows us to measure pairwise distances in multi-dimensional space rather than in only two dimensions. We refer to these approaches as LTTC (Lag Target Time Series Clustering) and MFTC (Multi-Factor Time Series Clustering), respectively. The proposed methods are applicable to any time dependent information from various research areas, and we have applied them to state level brain cancer mortality rates in the United States to illustrate their importance.
Copyright: © 2018, the Authors. Published by Atlantis Press.
Open Access: This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).

1. Introduction

We live in a world flooded with information that changes over time, and this time dependent information makes up the main part of Big Data, a current prime topic in data science. Several statistical approaches [1] [2] [3] [9] [10] [11] [12] [13] [14] [15] [16] have been proposed to extract the significant core from time dependent information. In the present study, we propose new methods to obtain the essence of such information by clustering time dependent responses, such as the time series data and longitudinal data that we are commonly asked to analyze. Figure 1, below, describes the time dependent information we deal with in statistics; in the present study, we focus on time series data and a part of longitudinal data.

Classical methods for clustering time dependent information are passive from a data scientist’s viewpoint: once a measure of dissimilarity is chosen, the resulting clusters are deterministic, whatever distance measurement is applied to the data. In contrast, the new methods we propose in the present study are active processes that extract the core information from the massive information to be analyzed, according to the objective of the study.

In general, we have three different clustering approaches for time dependent information as shown in Figure 1, that is,

① Temporal-Proximity-Based Clustering Approach.
② Representation-Based Clustering Approach.
③ Model-Based Clustering Approach.

Our proposed methods are developed to address problems that arise from the assumptions imposed in the temporal-proximity-based clustering approach. In this approach, we assume that plenty of information is available in each time series object, and that only one stream of information is given as a function of time.

But what if we do not have enough observations to use classical time series clustering methods, and what if several significant streams of information exist in each time series object? We therefore introduce two new clustering methods to cover these important cases in the temporal-proximity-based approach. Moreover, classical time series clustering methods do not account for actual time dependencies among time series objects, and the resulting clusters are usually based on trends and patterns. Hence, we are not able to investigate the actual degree of time dependency if we use classical time series clustering methods.

2. Motivation

In what follows we discuss the new methods we propose.

2.1. Lag Target Time Series Clustering

The first approach we propose in the current study is “Lag Target Time Series Clustering (LTTC)”. In time series analysis, we usually regard more than 50 observations in each time series object (response) as sufficient information, but this condition is not always satisfied in real-world problems. However, if we take cross lag distances into consideration, we can increase the number of distance measurements considerably.

In Figure 2, below, X_t is the baseline time series object, Y_t is a vertically shifted version of X_t, and Z_t is a preceding index of X_t. Now, which information is more closely related to the baseline time series object, X_t? If we ignore the lag time dependency between two time series objects, we have

$d(X_t, Y_t) \ll d(X_t, Z_t),$

no matter what distance measure we use. However, if we measure the cross lag-one distance between two time series objects, we obtain

$d(X_{t-1}, Y_t) \gg d(X_{t-1}, Z_t).$

Now, suppose we have two different clusters, one containing Y_t and the other containing Z_t. Does X_t belong to the cluster with Y_t or to the one with Z_t? We clearly need to include all three time series objects in the same cluster, and we can obtain this desirable cluster using our proposed method, LTTC.
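The two inequalities above can be checked numerically. The following is only a minimal sketch on synthetic data, not the data behind Figure 2: the sine-shaped baseline, the shift of 0.5, the reading of "preceding index" as Z_t = X_{t-1}, and the use of the plain Euclidean distance are all assumptions made for illustration.

```python
import numpy as np

def euclid(a, b):
    """Plain Euclidean distance between two equally long series."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

t = np.arange(40)
x = 10.0 * np.sin(0.6 * t)      # baseline series X_t (synthetic)
y = x + 0.5                     # Y_t: X_t shifted vertically by 0.5
z = np.empty_like(x)            # Z_t: assumed here to equal X_{t-1}
z[1:] = x[:-1]
z[0] = x[0]                     # arbitrary start value, dropped below

# Cross lag zero: compare values at the same time index (t = 1, ..., 39).
d0_xy = euclid(x[1:], y[1:])
d0_xz = euclid(x[1:], z[1:])

# Cross lag one: compare X_{t-1} with Y_t and with Z_t.
d1_xy = euclid(x[:-1], y[1:])
d1_xz = euclid(x[:-1], z[1:])

print(f"lag 0: d(X,Y) = {d0_xy:.2f}, d(X,Z) = {d0_xz:.2f}")
print(f"lag 1: d(X,Y) = {d1_xy:.2f}, d(X,Z) = {d1_xz:.2f}")
```

At lag zero the shifted series is the closer one, while at lag one the ordering reverses, which is the situation LTTC is designed to detect.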

2.2. Multi-Factor Time Series Clustering

The second method we propose in the present study is “Multi-Factor Time Series Clustering (MFTC)”. MFTC is a more realistic extension of the previously proposed method, LTTC. As mentioned in the introduction, one of the general assumptions in classical temporal-proximity-based time series clustering is that only one stream of information exists in each time series object. However, each time series response usually consists of several pieces of sub-information. For example, a daily stock price consists of sub-information such as the opening price, closing price, maximum price, and minimum price. If each piece of sub-information behaves differently and has a significant impact on the original information, we should take these differences into consideration in our modeling. Similarly, in health science, the survival of patients is a function of time and death is related to several factors; in lung cancer, for example, these include smoking, being overweight, age, and drinking. Thus, we must take these risk factors into consideration in survival modeling. Therefore, when we measure the distance between two time series objects, we place our ruler in a multi-dimensional space whose dimension is always “the number of factors considered in the study plus one”, because of the time factor. If we measure only the cross lag zero distance, the computation is trivial, as shown in Figure 3. However, when we measure cross lag distances as shown in Figure 4, we have to consider the difference in units between time and the other factors; a weight factor, presented in a later section, replaces the time unit.

3. An Application of LTTC and MFTC: Brain Cancer Mortality Rates in the United States

In what follows, we apply our methods to an important real data set.

3.1. Objective of the Study

There have been various statistical models of brain cancer mortality rates for the entire United States [4], [5], [6]. However, we are not aware of any study of regional differences in brain cancer mortality rates in the United States. We strongly believe that there are significant regional differences, primarily due to environmental issues, such as carbon dioxide emission and the quality of drinking water, that cause death from brain cancer. Thus, our proposed analytic clustering procedure based on regional brain cancer mortality rates in the United States is very important.

3.2. Structure of the Data

The data we use come from the Surveillance, Epidemiology, and End Results (SEER) database, one of the biggest epidemiological databases in the U.S., and contain U.S. state level mortality rates due to brain cancer from 1969 to 2012. Figure 5, below, shows the structure of the data, with 9 climate regions, 51 states including D.C., and mortality rates calculated separately for males and females. In each state, m_t and f_t represent the number of deaths per 100,000 population due to brain cancer at time t (= 1, 2, ..., 43) for males and females, respectively.

Table 1, below, displays p-values from nonparametric Kruskal-Wallis tests [7] [17] [18] of the hypothesis that the median brain cancer mortality rates of males and females are the same in each state of the United States. The p-values in Table 1 suggest that we should use the MFTC method to achieve the objective of the study. For example, the largest p-value in Table 1 is 0.034, for the state of North Dakota, and even this p-value is small enough to conclude that the difference between male and female brain cancer mortality rates is statistically significant at the level of significance α = 0.05.

State	p-value	State	p-value	State	p-value
IL	1.56E-14	NH	6.40E-06	FL	4.85E-14
IN	1.30E-09	NJ	9.38E-14	GA	4.42E-13
KY	1.83E-08	NY	1.73E-15	NC	9.19E-08
MO	6.05E-11	PA	2.60E-08	SC	1.36E-08
OH	6.45E-15	RI	1.32E-05	VA	3.01E-11
TN	7.21E-12	VT	2.10E-05	AZ	7.56E-10
WV	1.94E-04	AK	6.78E-03	CO	5.40E-09
IA	2.14E-07	ID	2.27E-05	NM	4.92E-08
MI	3.59E-11	OR	7.62E-11	UT	3.74E-06
MN	2.82E-13	WA	7.21E-14	CA	1.41E-15
WI	6.79E-12	AR	5.21E-06	HI	4.48E-05
CT	1.24E-11	KS	1.60E-10	NV	3.56E-09
DE	7.92E-04	LA	7.97E-08	MT	3.10E-07
DC	7.14E-03	MS	1.28E-04	NE	1.34E-07
ME	5.58E-07	OK	3.70E-10	ND	3.40E-02
MD	2.24E-10	TX	7.62E-11	SD	6.78E-03
MA	6.79E-11	AL	2.65E-10	WY	7.32E-03

Table 1: Comparison Between Male and Female Brain Cancer Mortality Rates.
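The per-state test behind Table 1 can be sketched as follows. This is an illustration only: the male and female series are synthetic draws, not SEER rates, and `scipy.stats.kruskal` is one available implementation of the Kruskal-Wallis test, not necessarily the software used for Table 1.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)
T = 43                                        # number of yearly observations
male = 6.0 + rng.normal(0.0, 0.5, size=T)     # synthetic male rates
female = 4.0 + rng.normal(0.0, 0.5, size=T)   # synthetic female rates

# Null hypothesis: male and female rates share the same median level.
stat, p_value = kruskal(male, female)

# A small p-value suggests treating the two genders as separate factors.
use_both_factors = p_value < 0.05
print(f"H = {stat:.2f}, p-value = {p_value:.2e}")
```

Repeating this test for each state would give one p-value per state, in the layout of Table 1.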

4. Construction of the Dissimilarity Matrix

Statistical clustering procedures are performed on the dissimilarity matrix, which is a set of pairwise distances among time series responses. Based on the structure of the data shown in Figure 5, and using the proposed MFTC method as suggested by the results in Table 1, we define pairwise distances as follows.

4.1. Distance at the Cross Lag Zero

First, we define pairwise distances among the mortality rates of all U.S. states at cross lag zero. Let

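$R_i=\begin{bmatrix} m_{i1} & f_{i1} \\ m_{i2} & f_{i2} \\ \vdots & \vdots \\ m_{iT} & f_{iT} \end{bmatrix}$

and

$R_j=\begin{bmatrix} m_{j1} & f_{j1} \\ m_{j2} & f_{j2} \\ \vdots & \vdots \\ m_{jT} & f_{jT} \end{bmatrix}$

be the brain cancer mortality rates in state i and state j, respectively, and define a difference matrix,

$D=R_i-R_j=\begin{bmatrix} m_{i1}-m_{j1} & f_{i1}-f_{j1} \\ m_{i2}-m_{j2} & f_{i2}-f_{j2} \\ \vdots & \vdots \\ m_{iT}-m_{jT} & f_{iT}-f_{jT} \end{bmatrix}=\begin{bmatrix} d_{m1} & d_{f1} \\ d_{m2} & d_{f2} \\ \vdots & \vdots \\ d_{mT} & d_{fT} \end{bmatrix}.$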

Then the distance between state i and state j at cross lag zero is given by

(4.1)

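$d_{ij}=\sum_{t=1}^{T} D_t S^{-1} D_t' \cdot W_t,$

where D_t is the t-th row of the difference matrix D, S is COV(D_m, D_f), and W_t is a weight factor, which is the ratio of the absolute value of the sample autocorrelation, and is defined as

$W_t=\frac{\frac{1}{2T}\left(|M|+|F|\right)}{\sum_{t=1}^{T}\left(|M|+|F|\right)},$

where

$M=\sum_{\tau=1}^{t}\left(d_{m,\tau+T-t}-\bar{d}_m\right)\left(d_{m,\tau}-\bar{d}_m\right)$

and

$F=\sum_{\tau=1}^{t}\left(d_{f,\tau+T-t}-\bar{d}_f\right)\left(d_{f,\tau}-\bar{d}_f\right).$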

Equation (4.1) is essentially a weighted Mahalanobis distance. Our distance measures are built upon the Mahalanobis distance because the inverse covariance factor stabilizes the overall distance matrix, so that the effect of the weight factor is minimized and not over-counted [8] [19] [20].
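Equation (4.1) can be sketched in code as follows. This is an illustrative reading of the formulas, under stated assumptions: S is taken as the 2 × 2 sample covariance of the two columns of D, the rate matrices are synthetic, and the function names (`autocov_terms`, `time_weights`, `lag_zero_distance`) are hypothetical.

```python
import numpy as np

def autocov_terms(d):
    """Return the sums M (or F) of Section 4.1 for t = 1, ..., T."""
    T = len(d)
    c = d - d.mean()
    # For each t, pair d_{tau+T-t} with d_tau for tau = 1, ..., t.
    return np.array([np.sum(c[T - t:] * c[:t]) for t in range(1, T + 1)])

def time_weights(D):
    """Weights W_t built from the male and female columns of D."""
    T = D.shape[0]
    a = np.abs(autocov_terms(D[:, 0])) + np.abs(autocov_terms(D[:, 1]))
    return a / (2.0 * T * a.sum())

def lag_zero_distance(Ri, Rj):
    """Weighted Mahalanobis distance between two T x 2 rate matrices."""
    D = Ri - Rj
    # Assumption: S is the sample covariance of the columns of D.
    S_inv = np.linalg.inv(np.cov(D, rowvar=False))
    quad = np.einsum("ti,ij,tj->t", D, S_inv, D)   # D_t S^{-1} D_t'
    return float(np.sum(quad * time_weights(D)))

rng = np.random.default_rng(1)
T = 43
Ri = rng.normal([6.0, 4.0], 0.5, size=(T, 2))   # synthetic state i
Rj = rng.normal([5.5, 4.2], 0.5, size=(T, 2))   # synthetic state j
d_ij = lag_zero_distance(Ri, Rj)
d_ji = lag_zero_distance(Rj, Ri)
print(f"d_ij = {d_ij:.4f}, d_ji = {d_ji:.4f}")
```

Under this reading the weights W_t sum to 1/(2T), and the distance is symmetric in i and j because negating D changes neither the covariance nor the quadratic forms.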

4.2. Distance at the Cross Lag k (k ≥ 1)

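We now define _kR_i, the brain cancer mortality rates in state i after eliminating k rows from the front, and R_{j,k}, the brain cancer mortality rates in state j after removing k rows from the tail.

${}_kR_i=\begin{bmatrix} m_{i,k+1} & f_{i,k+1} \\ m_{i,k+2} & f_{i,k+2} \\ \vdots & \vdots \\ m_{i,T} & f_{i,T} \end{bmatrix}$

and

$R_{j,k}=\begin{bmatrix} m_{j,1} & f_{j,1} \\ \vdots & \vdots \\ m_{j,T-1-k} & f_{j,T-1-k} \\ m_{j,T-k} & f_{j,T-k} \end{bmatrix},$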

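where m_{i,k} and f_{i,k} denote the male brain cancer mortality rate at time k and the female brain cancer mortality rate at time k for the state i, respectively. Accordingly, the backward difference and the forward difference matrices can be obtained as given below.

(4.2)

${}_kD={}_kR_i-R_{j,k}=\begin{bmatrix} m_{i,k+1}-m_{j,1} & f_{i,k+1}-f_{j,1} \\ m_{i,k+2}-m_{j,2} & f_{i,k+2}-f_{j,2} \\ \vdots & \vdots \\ m_{i,T-1}-m_{j,T-1-k} & f_{i,T-1}-f_{j,T-1-k} \\ m_{i,T}-m_{j,T-k} & f_{i,T}-f_{j,T-k} \end{bmatrix}=\begin{bmatrix} {}_kd_{m,1} & {}_kd_{f,1} \\ {}_kd_{m,2} & {}_kd_{f,2} \\ \vdots & \vdots \\ {}_kd_{m,T-1-k} & {}_kd_{f,T-1-k} \\ {}_kd_{m,T-k} & {}_kd_{f,T-k} \end{bmatrix}$

and

(4.3)

$D_k=R_{i,k}-{}_kR_j=\begin{bmatrix} m_{i,1}-m_{j,k+1} & f_{i,1}-f_{j,k+1} \\ m_{i,2}-m_{j,k+2} & f_{i,2}-f_{j,k+2} \\ \vdots & \vdots \\ m_{i,T-1-k}-m_{j,T-1} & f_{i,T-1-k}-f_{j,T-1} \\ m_{i,T-k}-m_{j,T} & f_{i,T-k}-f_{j,T} \end{bmatrix}=\begin{bmatrix} d_{m,k,1} & d_{f,k,1} \\ d_{m,k,2} & d_{f,k,2} \\ \vdots & \vdots \\ d_{m,k,T-1-k} & d_{f,k,T-1-k} \\ d_{m,k,T-k} & d_{f,k,T-k} \end{bmatrix}$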

Based on equations (4.2) and (4.3), we can establish the cross lag k distance between state i and state j as the mean of the weighted backward Mahalanobis distance and the weighted forward Mahalanobis distance, as given by equation (4.4), below.

(4.4)

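$d_{ij,k}=\frac{1}{2}\left(\sum_{t=1}^{T-k} {}_kD_t \, {}_kS^{-1} \, {}_kD_t' \cdot {}_kW_t+\sum_{t=1}^{T-k} D_{t,k} \, S_k^{-1} \, D_{t,k}' \cdot W_{t,k}\right),$

where the two weight factors, _kW_t and W_{t,k}, are defined below for k = 0, 1, 2, ..., T − 3.

${}_kW_t=\frac{\frac{1}{2(T-k)}\left(|M_1|+|F_1|\right)}{\sum_{t=1}^{T-k}\left(|M_1|+|F_1|\right)},$

where

$M_1=\sum_{\tau=1}^{t}\left({}_kd_{m,\tau+T-k-t}-{}_k\bar{d}_m\right)\left({}_kd_{m,\tau}-{}_k\bar{d}_m\right)$

and

$F_1=\sum_{\tau=1}^{t}\left({}_kd_{f,\tau+T-k-t}-{}_k\bar{d}_f\right)\left({}_kd_{f,\tau}-{}_k\bar{d}_f\right),$

and

$W_{t,k}=\frac{\frac{1}{2(T-k)}\left(|M_2|+|F_2|\right)}{\sum_{t=1}^{T-k}\left(|M_2|+|F_2|\right)},$

where

$M_2=\sum_{\tau=1}^{t}\left(d_{m,k,\tau+T-k-t}-\bar{d}_{m,k}\right)\left(d_{m,k,\tau}-\bar{d}_{m,k}\right)$

and

$F_2=\sum_{\tau=1}^{t}\left(d_{f,k,\tau+T-k-t}-\bar{d}_{f,k}\right)\left(d_{f,k,\tau}-\bar{d}_{f,k}\right).$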
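The cross lag k distance of equation (4.4) can be sketched along the same lines. Again this is a hedged illustration rather than the authors' implementation: each difference matrix is assumed to use its own 2 × 2 sample covariance, the data are synthetic, and the helper names are hypothetical.

```python
import numpy as np

def autocov_terms(d):
    """Sums of lagged products of deviations, for t = 1, ..., len(d)."""
    n = len(d)
    c = d - d.mean()
    return np.array([np.sum(c[n - t:] * c[:t]) for t in range(1, n + 1)])

def weighted_mahalanobis(D):
    """Sum over rows of D_t S^{-1} D_t' times the row weight."""
    n = D.shape[0]
    a = np.abs(autocov_terms(D[:, 0])) + np.abs(autocov_terms(D[:, 1]))
    W = a / (2.0 * n * a.sum())
    # Assumption: the covariance is estimated from this difference matrix.
    S_inv = np.linalg.inv(np.cov(D, rowvar=False))
    return float(np.sum(np.einsum("ti,ij,tj->t", D, S_inv, D) * W))

def cross_lag_distance(Ri, Rj, k):
    """Mean of the backward and forward weighted distances at lag k."""
    T = Ri.shape[0]
    if not 0 <= k <= T - 3:
        raise ValueError("k must satisfy 0 <= k <= T - 3")
    backward = Ri[k:] - Rj[:T - k]     # difference matrix of equation (4.2)
    forward = Ri[:T - k] - Rj[k:]      # difference matrix of equation (4.3)
    return 0.5 * (weighted_mahalanobis(backward) + weighted_mahalanobis(forward))

rng = np.random.default_rng(2)
T = 20
Ri = rng.normal([6.0, 4.0], 0.5, size=(T, 2))   # synthetic state i
Rj = rng.normal([5.5, 4.2], 0.5, size=(T, 2))   # synthetic state j
d_ij_3 = cross_lag_distance(Ri, Rj, k=3)
d_ji_3 = cross_lag_distance(Rj, Ri, k=3)
print(f"d_ij,3 = {d_ij_3:.4f}, d_ji,3 = {d_ji_3:.4f}")
```

Averaging the backward and forward terms makes the lag k distance symmetric in i and j, since swapping the two states exchanges the two difference matrices up to sign. With k = 0 both terms coincide and the expression reduces to equation (4.1).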

4.3. The Dissimilarity Matrix for Clustering

Using the distances defined above, we obtain l layers of distance matrices, as shown in Figure 6, below. In each cross lag distance matrix in Figure 6, d_{i j,k} represents the weighted Mahalanobis distance between state i and state j at cross lag k.

To complete our final dissimilarity matrix for the clustering procedure, we define a weight factor for each layer, which is the ratio of the absolute value of the sample cross-correlation, as shown in equation (4.5); the resulting structure of the weight matrices is displayed in Figure 7. These weight factors take into consideration, at the same time, the difference between genders and the time dependency between two objects. That is,

(4.5)

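$\delta_{ij,k}=\frac{\frac{1}{2T}\left(|M_3|+|F_3|\right)}{\sum_{k=0}^{T-3}\left(|M_3|+|F_3|\right)}$

where

$M_3=\sum_{\tau=1}^{T-k}\left(m_{i,\tau+k}-\bar{m}_i\right)\left(m_{j,\tau}-\bar{m}_j\right)$

and

$F_3=\sum_{\tau=1}^{T-k}\left(f_{i,\tau+k}-\bar{f}_i\right)\left(f_{j,\tau}-\bar{f}_j\right).$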

In each layer in Figure 7, δ_{i j,k} denotes the weight for d_{i j,k} in Figure 6, that is, the weight factor applied to the distance between state i and state j at cross lag k.

Now, we multiply the distance layers in Figure 6 by the weight layers in Figure 7, and add all the resulting layers to build the final dissimilarity matrix presented in Figure 8, on which we perform the statistical clustering procedure. At this stage, our main interest lies in the selection of the optimal level of lag distance, because the final dissimilarity matrix is very sensitive to the choice of the lag, k.

In Figure 8, d_{i j} is the final similarity or dissimilarity index between state i and state j, in other words, the sum of the weighted cross lag distances between state i and state j.
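The layer weights of equation (4.5) and the summation over layers can be sketched as below. The function names are hypothetical, the means are taken over the full series as the formula is written, and the distance and weight layers in the demonstration are random placeholders rather than values computed from equation (4.4).

```python
import numpy as np

def cross_cov_terms(a, b, max_lag):
    """Sums M3 (or F3) of equation (4.5) for k = 0, ..., max_lag."""
    T = len(a)
    ca, cb = a - a.mean(), b - b.mean()
    # Pair a_{tau+k} with b_tau for tau = 1, ..., T - k.
    return np.array([np.sum(ca[k:] * cb[:T - k]) for k in range(max_lag + 1)])

def layer_weights(Ri, Rj):
    """Weights delta_{ij,k}, k = 0, ..., T - 3, for one pair of states."""
    T = Ri.shape[0]
    M3 = cross_cov_terms(Ri[:, 0], Rj[:, 0], T - 3)
    F3 = cross_cov_terms(Ri[:, 1], Rj[:, 1], T - 3)
    a = np.abs(M3) + np.abs(F3)
    return a / (2.0 * T * a.sum())

def final_dissimilarity(dist_layers, weight_layers, max_lag):
    """Sum weight * distance over the layers k = 0, ..., max_lag."""
    return np.sum(weight_layers[: max_lag + 1] * dist_layers[: max_lag + 1], axis=0)

rng = np.random.default_rng(3)
T = 10
Ri = rng.normal([6.0, 4.0], 0.5, size=(T, 2))   # synthetic state i
Rj = rng.normal([5.5, 4.2], 0.5, size=(T, 2))   # synthetic state j
delta = layer_weights(Ri, Rj)                   # one weight per lag

# Placeholder layers for 4 objects and lags 0, 1, 2 (illustration only).
dist_layers = rng.uniform(1.0, 2.0, size=(3, 4, 4))
weight_layers = rng.uniform(0.0, 1.0, size=(3, 4, 4))
final = final_dissimilarity(dist_layers, weight_layers, max_lag=1)
```

Here `max_lag` plays the role of the chosen lag level: raising it adds one more weighted layer to the final matrix.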

5. Clustering Procedure

We use Ward’s clustering method in this section to obtain our resulting clusters. Joe H. Ward, Jr. proposed a general agglomerative hierarchical clustering procedure based on a minimum variance criterion, also called “Ward’s Minimum Variance Method” [21] [22] [23]. In other words, our final clusters are obtained by minimizing the within-cluster variance, which is defined by the squared Euclidean distances among clustering objects, as shown in equation (5.1), below.

(5.1)

$d_{ij}=d(\{X_i\},\{X_j\})=\|X_i-X_j\|^2.$
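Ward's method can be applied to a precomputed dissimilarity matrix as sketched below. The example uses SciPy's hierarchical clustering routines as one available implementation; the four points are toy data, the wrapper name `ward_clusters` is hypothetical, and SciPy's Ward update is formulated for (unsquared) Euclidean distances, so using it with the weighted Mahalanobis dissimilarities of Section 4 is an assumption.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def ward_clusters(dissimilarity, n_clusters):
    """Cut a Ward hierarchy built from a square dissimilarity matrix."""
    condensed = squareform(dissimilarity, checks=True)  # upper triangle
    tree = linkage(condensed, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Toy example: two tight pairs of points that are far from each other.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
diff = points[:, None, :] - points[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=2))    # pairwise Euclidean distances

labels = ward_clusters(D, n_clusters=2)
print(labels)
```

In the application, `D` would be replaced by the final dissimilarity matrix of Figure 8 and `n_clusters` by the number of clusters sought.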

5.1. Clusters Based on Euclidean Distance vs. Mahalanobis Distance

Before we move to our main clustering problem of brain cancer mortality rates in the U.S., we compare the clustering results obtained with the Euclidean distance and with the Mahalanobis distance. Figure 9 presents clustering maps based on the Euclidean distance and the Mahalanobis distance, with the same weight factors described in the previous sections. Both maps give a four-cluster solution, and they are almost identical: only two states, Washington and New Hampshire, fall in different clusters in the two maps. This implies that the covariance between males and females is not large, but it is still significant, because the covariance stabilizes the pairwise distances so that the weight factors have an appropriate level of effect.

5.2. Passive Deterministic Clustering vs. Active Dynamic Clustering

The map at the bottom of Figure 9 shows the resulting clusters based on our definition of distance in equation (4.1). States in the green cluster are mostly located in the southern region of the U.S., and the other colored clusters are also determined by the dissimilarity matrix with lag zero obtained in the previous sections. With this approach, once we have a dissimilarity matrix, the clustering solution is determined only by the clustering method we choose. In this sense, we refer to this classical approach as “Passive Deterministic Clustering”.

The algorithm of LTTC is presented in Figure 10. This procedure is an active and dynamic way to cluster time series responses, because the final cluster solution is itself the end objective of the study. Using this method, we first choose our target cluster, which consists of time series objects that have similar characteristics, and then perform the clustering procedure iteratively, including one more cross lag distance each time, until we achieve our target cluster. Once we obtain the target cluster, we continue the procedure until the target cluster breaks up. If the target cluster breaks up with a dissimilarity matrix that includes the cross lag k distance, our solution to the subject problem is the lag k − 1 clustering solution. From this solution, we can see the maximum degree of lag time dependency among time series objects in our target cluster, and the minimum lag time dependency in other clusters.
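The iteration described above can be sketched as a short loop. This is one possible reading of the procedure, not a transcription of Figure 10: the weighted layers are hand-made toy matrices, SciPy's Ward linkage stands in for the clustering step, the number of clusters is fixed in advance, and the names `together` and `lttc_lag` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def together(D, target, n_clusters):
    """True if all target objects share one cluster under Ward's method."""
    tree = linkage(squareform(D), method="ward")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return len({int(labels[i]) for i in target}) == 1

def lttc_lag(layers, target, n_clusters):
    """Add one weighted lag layer at a time; return the last lag at which
    the target cluster holds, or None if it never forms."""
    D = np.zeros_like(layers[0])
    formed = False
    for k, layer in enumerate(layers):
        D = D + layer                      # include the cross lag k layer
        ok = together(D, target, n_clusters)
        if ok:
            formed = True
        elif formed:
            return k - 1                   # target broke up at lag k
    # Assumption: if the target never breaks up, report the last lag tried.
    return len(layers) - 1 if formed else None

def sym(ab, ac, ad, bc, bd, cd):
    """Symmetric 4 x 4 matrix from the six pairwise values of A, B, C, D."""
    return squareform(np.array([ab, ac, ad, bc, bd, cd], dtype=float))

# Toy weighted layers for objects A, B, C, D at lags 0, 1, 2, 3.
layers = [
    sym(10, 1, 10, 10, 1, 10),       # lag 0: A with C, B with D
    sym(0, 20, 20, 20, 20, 0),       # lag 1: pulls A and B together
    sym(1, 1, 1, 1, 1, 1),           # lag 2: no change in structure
    sym(100, 0, 100, 100, 0, 100),   # lag 3: separates A and B again
]
solution_lag = lttc_lag(layers, target=(0, 1), n_clusters=2)
print(solution_lag)
```

In this toy setting the target pair first clusters together when the lag 1 layer is included and separates again at lag 3, so the loop reports the lag 2 solution.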

5.3. Applying the Proposed Method

Now, we consider that the states of Texas and Florida have similar population characteristics and climate conditions; accordingly, the objective of our study is to find the degree of lag time dependency between the two states. As shown in Figure 9, the two states are not in the same cluster when we ignore lag dependency among all of the U.S. states. Therefore, we add one more lag distance each time before performing the iterative clustering procedure, and we obtain the “Lag 3 Clusters” as our final solution of the subject problem, as shown in Figure 11. This implies that the brain cancer mortality rates of Florida and Texas have a lag 3 time dependency; Figure 11 also identifies other states that have the same lag time dependency with these two states.

6. Conclusion

In the present study, we propose an active and dynamic method to cluster time dependent information. The application of MFTC and LTTC is not confined to clustering information of the same kind; the methods can also be used to investigate time dependent relationships among information from various research areas.

We illustrated the usefulness of the proposed method by addressing the open problem of clustering brain cancer mortality rates in the USA. This information is quite important in investigating other risk factors on a regional basis, such as environmental issues that may influence brain cancer deaths.

The proposed active and dynamic procedure is applicable to many important clustering problems in finance, ecology, and health sciences, among others. Given the clusters of brain cancer mortality rates obtained here, one can investigate what other effects, such as CO₂ in the atmosphere or the quality of water, may contribute to brain cancer mortality. This procedure can also be applied to breast cancer, lung cancer, prostate cancer, etc.

In finance, clustering the signals (the price of a given stock as a function of time) for a given business segment, such as the health industry, which consists of a number of stocks, is quite important for investing effectively in that sector. The LTTC and MFTC methods can provide portfolio managers with very important information for strategic changes in their investment objectives.

References

[1] T Warren Liao, Clustering of time series data: A survey, Pattern Recognition, Vol. 38, No. 11, 2005, pp. 1857-1874.

[2] Y Xiong, Mixtures of ARMA models for model-based time series clustering, Data Mining, ICDM proceedings, 2002, pp. 717-720.

[3] X Wang and R Hyndman, Characteristic-Based Clustering for Time Series Data, Data Mining and Knowledge Discovery, Vol. 13, 2006, pp. 335-364.

[4] S Deorah, CF Lynch, ZA Sibenaller, and TC Ryken, Trends in brain cancer incidence and survival in the United States: Surveillance, Epidemiology, and End Results Program, 1973 to 2001, Neurosurgical Focus, Vol. 20, No. 4, 2006, pp. E1.

[5] TA Dolecek, JM Propp, NE Stroup, and C Kruchko, CBTRUS Statistical Report: Primary Brain and Central Nervous System Tumors Diagnosed in the United States in 2005-2009, Neuro-Oncology, Vol. 14, No. 5, 2012, pp. v1-v49.

[6] MA Smith, B Freidlin, LAG Ries, and R Simon, Trends in Reported Incidence of Primary Malignant Brain Tumors in Children in the United States, Journal of the National Cancer Institute, Vol. 90, No. 17, 1998, pp. 1269-1277.

[7] WH Kruskal and WA Wallis, Use of ranks in one-criterion variance analysis, Journal of the American Statistical Association, Vol. 47, No. 260, 1952, pp. 583-621.

[8] PC Mahalanobis, On the generalized distance in statistics, Proceedings of the National Institute of Sciences (Calcutta), Vol. 2, 1936, pp. 49-55.

[9] C Goutte, P Toft, E Rostrup, F Nielsen, and L Hansen, On Clustering fMRI Time Series, NeuroImage, Vol. 9, No. 3, 1999, pp. 298-310.

[10] EJ Keogh and MJ Pazzani, An enhanced Representation of Time Series which Allows Fast and Accurate Classification, Clustering and Relevance Feedback, KDD-98 Proceedings, 1998, pp. 239-278.

[11] M Corduas and D Piccolo, Time Series Clustering and Classification by the Autoregressive Metric, Computational Statistics & Data Analysis, Vol. 52, No. 4, 2008, pp. 1860-1872.

[12] Y Xiong and D Yeung, Time Series Clustering with ARMA Mixtures, Pattern Recognition, Vol. 37, No. 8, 2004, pp. 1675-1689.

[13] K Kalpakis, D Gada, and V Puttagunta, Distance Measures for Effective Clustering of ARIMA Time Series, Data Mining, (ICDM2001), 2001, pp. 273-280.

[14] AM Alonso, JR Berrendero, A Hernandez, and A Justel, Time Series Clustering Based on Forecast Densities, Computational Statistics & Data Analysis, Vol. 51, No. 2, 2006, pp. 762-776.

[15] Y Kakizawa, RH Shumway, and M Taniguchi, Discrimination and Clustering for Multivariate Time Series, Journal of the American Statistical Association, Vol. 93, No. 441, 1998, pp. 328-340.

[16] D Jiang, J Pei, and A Zhang, DHC: A Density-based Hierarchical Clustering Method for Time Series Gene Expression Data, Bioinformatics and Bioengineering, Proceedings, 2003, pp. 393-400.

[17] N Breslow, A generalized Kruskal-Wallis test for comparing K samples subject to unequal patterns of censorship, Biometrika, Vol. 57, No. 3, 1970, pp. 579-594.

[18] E Theodorsson-Norheim, Kruskal-Wallis Test: BASIC Computer Program to Perform Nonparametric One-way Analysis of Variance and Multiple Comparisons on Ranks of Several Independent Samples, Computer Methods and Programs in Biomedicine, Vol. 23, No. 1, 1986, pp. 57-62.

[19] R De Maesschalck, D Jouan-Rimbaud, and DL Massart, The Mahalanobis Distance, Chemometrics and Intelligent Laboratory Systems, Vol. 50, No. 1, 2000, pp. 1-18.

[20] S Hayashi, Y Tanaka, and E Kodama, A New Manufacturing Control System Using Mahalanobis Distance for Maximising Productivity, Semiconductor Manufacturing Symposium, 2001, pp. 59-62.

[21] GJ Szekely, Hierarchical clustering via joint between-within distances: Extending Ward’s minimum variance method, Journal of Classification, Vol. 22, 2005, pp. 151-183.

[22] A El-Hamdouchi and P Willett, Hierarchic Document Classification Using Ward’s Clustering Method, in Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval (1986), pp. 149-156.

[23] C Hervada-Sala and E Jarauta-Bragulat, Program to Perform Ward’s Clustering Method on Several Regionalized Variables, Computers & Geosciences, Vol. 30, No. 8, 2004, pp. 881-886.

Journal: Journal of Statistical Theory and Applications
Volume-Issue: 17 - 3
Pages: 462 - 477
Publication Date: 2018/09/30
ISSN (Online): 2214-1766
ISSN (Print): 1538-7887