Title: | Refined Modified Stahel-Donoho (MSD) Estimators for Outlier Detection (Parallel Version) |
---|---|
Description: | Provides a parallel implementation of the modified Stahel-Donoho (MSD) estimators for multivariate outlier detection. The function RMSDp() is intended for elliptically distributed datasets and identifies outliers based on Mahalanobis distance. It targets higher-dimensional datasets that cannot be handled by the single-core function RMSD() in the 'RMSD' package. See Wada and Tsubaki (2013) <doi:10.1109/CLOUDCOM-ASIA.2013.86> for details of the algorithm. |
Authors: | Kazumi Wada [aut, cre] |
Maintainer: | Kazumi Wada <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.1 |
Built: | 2025-03-08 04:00:39 UTC |
Source: | https://github.com/kazwd2008/rmsdp |
This function is for multivariate outlier detection.
version 0.0.1 2013/06/15 Related paper: DOI: 10.1109/CLOUDCOM-ASIA.2013.86
version 0.0.2 2021/11/15 Outlier detection step added
version 0.0.3 2022/08/12 Fixed a bug in random seed setting
RMSDp(inp, cores = 0, nb = 0, sd = 0, pt = 0.999, dv = 10000)
inp | input data (a numeric matrix) |
cores | number of cores used by this function |
nb | number of bases |
sd | random seed (for reproducibility) |
pt | threshold for outlier detection (probability) |
dv | maximum number of elements processed together on the same core |
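A minimal sketch of a call, assuming the RMSDp package is installed and that its `wine` data set can be loaded with `data()`. The values chosen for `cores`, `sd`, and `pt` are illustrative only, and the block is skipped when the package is not available.

```r
# Assumption: the RMSDp package is installed and ships the `wine` data set.
if (requireNamespace("RMSDp", quietly = TRUE)) {
  data("wine", package = "RMSDp")

  # RMSDp() takes a numeric matrix, so convert the data frame first.
  x <- as.matrix(wine)

  # Illustrative settings: two cores, a fixed seed, the default probability.
  res <- RMSDp::RMSDp(x, cores = 2, sd = 1, pt = 0.999)

  # Count observations by outlier flag (1 = normal observation, 2 = outlier).
  print(table(res$ot))
}
```

Fixing `sd` to a nonzero value is meant to make repeated runs comparable; the remaining arguments are left at the defaults shown in the usage line.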
a list with the following components
u final mean vector
V final covariance matrix
wt final weights
mah squared Mahalanobis distances
cf threshold for detecting outliers (percentile point)
ot outlier flag (1:normal observation, 2:outlier)
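The relationship between these components can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the classical mean and covariance stand in for the robust estimates `u` and `V`, and the cutoff is assumed to be a chi-square quantile at probability `pt`, which may differ from the reference distribution RMSDp() actually uses. The data are synthetic.

```r
# Synthetic data: 200 trivariate normal rows, five of them shifted far away.
set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)
x[1:5, ] <- x[1:5, ] + 10

# Stand-ins for `u` and `V`: classical estimates, NOT the MSD estimators.
u <- colMeans(x)
V <- cov(x)

# Squared Mahalanobis distance of every row from (u, V), as in `mah`.
mah <- mahalanobis(x, center = u, cov = V)

# Assumed cutoff: chi-square quantile with p = ncol(x) degrees of freedom.
pt <- 0.999
cf <- qchisq(pt, df = ncol(x))

# Same flag coding as `ot`: 1 = normal observation, 2 = outlier.
ot <- ifelse(mah > cf, 2L, 1L)
print(table(ot))
```

Because classical estimates are themselves pulled toward the outliers, a robust estimator such as MSD is the better choice for `u` and `V` in practice; the sketch only shows how `mah`, `cf`, and `ot` fit together.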
Wine data set from the UCI Machine Learning Repository
wine
A data frame with 178 rows and 13 columns:
Alcohol
Malic acid
Ash
Alcalinity of ash
Magnesium
Total phenols
Flavonoids
Nonflavanoid phenols
Proanthocyanins
Color intensity
Hue
OD280/OD315 of diluted wines
Proline
<https://archive.ics.uci.edu/dataset/109/wine>