DSPA Chapter 16 (Variable Selection)

As we mentioned in Chapter 15, variable selection is very important when dealing with bioinformatics, healthcare, and biomedical data, where we may have more features than observations. Variable selection, or feature selection, helps us focus on the most informative features in the observations instead of on every recorded variable. Because of intrinsic and extrinsic noise, the volume and complexity of big health data, and various methodological and technological challenges, identifying the salient features may resemble finding a needle in a haystack. Here, we illustrate alternative strategies for feature selection using filtering (e.g., correlation-based feature selection), wrapping (e.g., recursive feature elimination), and embedding (e.g., variable importance via random forest classification) techniques.
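To make the filtering idea concrete, here is a minimal Python sketch of a correlation-based filter: each feature is scored by the absolute Pearson correlation with the outcome, and only the top-ranked features are kept. This is an illustration under stated assumptions, not the chapter's own code. The function names (`pearson`, `filter_by_correlation`), the column names, and the synthetic data are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def filter_by_correlation(columns, outcome, k):
    """Return the names of the k features with the largest |r| to the outcome."""
    scores = {name: abs(pearson(values, outcome))
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic data (an assumption for illustration): 'signal' drives the
# outcome, while the two noise columns are unrelated to it.
rng = random.Random(0)
n = 200
signal = [rng.gauss(0, 1) for _ in range(n)]
columns = {
    "signal": signal,
    "noise_a": [rng.gauss(0, 1) for _ in range(n)],
    "noise_b": [rng.gauss(0, 1) for _ in range(n)],
}
outcome = [2.0 * s + rng.gauss(0, 0.5) for s in signal]

selected = filter_by_correlation(columns, outcome, k=1)
print(selected)
```

A filter like this scores each feature independently of any model, which makes it cheap but blind to interactions between features. Wrapper and embedded methods address that limitation at a higher computational cost.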

