Abstract: I will describe how probability measures may be embedded into an RKHS, and the pseudo-metric induced by these embeddings. This pseudo-metric is a metric, and the embedding is injective, when the kernel is characteristic. I will give a number of conditions that can be used to prove a kernel has the characteristic property, with emphasis on a simple Fourier argument. I will cover hypothesis testing, including a study of local departures from the null, and a discussion of optimal kernel choice. The kernel distances may be situated in relation to other metrics used in statistics and probability: I will describe how they relate to L2 distances between Parzen window estimates, integral probability metrics, and energy distances.
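The embedding distance sketched in this abstract (the maximum mean discrepancy, MMD) has a simple empirical estimate. The following is a minimal illustrative sketch, not code from the lecture: a Gaussian kernel is assumed (which is characteristic, so the induced pseudo-metric is a metric), and all function names are my own.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased estimate of the squared MMD between samples X ~ P and Y ~ Q."""
    m, n = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, sigma)
    Kyy = gaussian_kernel(Y, Y, sigma)
    Kxy = gaussian_kernel(X, Y, sigma)
    # Diagonal terms are dropped so each expectation is estimated unbiasedly.
    return ((Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
            - 2.0 * Kxy.mean())

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(200, 1)), rng.normal(size=(200, 1)))
diff = mmd2_unbiased(rng.normal(size=(200, 1)), rng.normal(2.0, 1.0, (200, 1)))
assert same < diff  # a mean shift produces a clearly larger estimate
```

In a hypothesis test the statistic is compared against a null distribution, typically obtained by permuting the pooled sample.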
Wednesday (30 April): 4:00 pm -- 6:00 pm (903 SSW)
...
Lecture 3: Dependence measures using RKHS embeddings {ArthurPresent_Lecture3.pdf}
Abstract: I will approach dependence via two routes: as the distance between embeddings of a joint probability and the product of the marginals, and in terms of a covariance operator between mappings of the random variables to reproducing kernel Hilbert spaces. The latter gives a simpler interpretation, and relates more clearly to classical measures of dependence. As with energy distances, the Hilbert-Schmidt norm of the covariance operator can be interpreted as a distance covariance. A more powerful test of dependence can be obtained by replacing covariance operators with correlation operators, giving as one measure the kernel canonical correlation. Interestingly, correlation operators can be used to obtain an estimate of the chi-squared statistic which is asymptotically independent of the RKHS. Time permitting, I may cover some more advanced topics from work published in the last year (e.g. detecting interactions between triplets of variables, testing for dependence between time series, ...)
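As a rough illustration of the covariance-operator route, the squared Hilbert-Schmidt norm of the empirical cross-covariance operator (HSIC) reduces to a trace of centred kernel matrices. A minimal sketch, assuming Gaussian kernels on both variables; the names are illustrative, and this is the standard biased V-statistic form rather than any estimator specific to the lecture.

```python
import numpy as np

def rbf(X, sigma=1.0):
    """Gaussian kernel matrix on one sample (rows of X are observations)."""
    sq = (X**2).sum(1)
    return np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / (2.0 * sigma**2))

def hsic_biased(X, Y, sigma=1.0):
    """Biased (V-statistic) empirical HSIC: the squared Hilbert-Schmidt norm
    of the empirical cross-covariance operator, i.e. tr(K H L H) / n^2."""
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    K, L = rbf(X, sigma), rbf(Y, sigma)
    return np.trace(K @ H @ L @ H) / n**2
```

For characteristic kernels the population quantity is zero exactly when the variables are independent; in practice the statistic is calibrated with a permutation test.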
Gabor Szekely -- Brownian Distance Covariance and Energy Statistics {Szekely Columbia Workshop.ppt}
...
...
Energy Statistics {Szekely Columbia Workshop.ppt}
Monday (28 April): 1:30 pm -- 3:00 pm (903 SSW)
Lecture 1: Distance Correlation
...
09:00am -- 09:30am Opening remarks and breakfast
09:30am -- 11:45am Session I
...
Gabor Szekely {GaborTalk.ppt} {Szekely Columbia Colloquium.ppt}
10:15--11:00am Andrey Feuerverger {AndreyTalk.pdf}
11:00--11:45am Michael Kosorok {MichaelTalk.pdf}
01:15pm -- 03:30pm Session II: Junior Researchers Session
01:15--01:45pm Jingyi (Jessica) Li {JessicaTalk.pdf}
01:45--02:15pm Bharath Sriperumbudur {Bharath.pdf}
Short break
...
03:00--03:30pm Subhadeep Mukhopadhyay {DeepTalk.pdf}
03:00--03:30pm David Reshef and Yakir Reshef {DavidTalk.pdf}
Short break
04:00pm -- 05:30pm Session III
...
Arthur Gretton {ArthurTalk.pdf}
04:45--05:30pm Shaw-Hwa Lo {ShawHwaTalk.pptx}
End of term party and dinner
Titles and Abstracts:
Arthur Gretton: Kernel tests of homogeneity, independence, and multi-variable interaction
...
test statistics (e.g., for independence,
...
and on how they can be
Michael Kosorok: Using Brownian Distance Covariance in Semi-nonparametric Inference
Abstract: In this work, we propose two flexible procedures which use Brownian distance covariance for semi-nonparametric hypothesis testing. The first procedure tests the general hypothesis of whether a certain set of covariates is associated with a right-censored failure time. The general procedure requires only weak assumptions and does not require estimation of the censoring probability. The second procedure tests the adequacy of a semi-nonparametric model in the context of smoothing spline ANOVA (SS-ANOVA). Specifically, the test evaluates whether a given SS-ANOVA model with p main-effect variables and a predefined set of interactions is sufficient, or if more terms are needed. The procedure can also test whether any interactions are needed at all. For both procedures, we use model-based permutation and bootstrap approaches to obtain critical values. Theory and simulation studies verify that both procedures preserve the type-I error rate and have good power performance.
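Both procedures above calibrate their statistics by permutation; the same idea is easiest to see on the basic (uncensored) distance-covariance independence test. A hedged sketch for 1-d samples, with illustrative function names, following the standard sample distance covariance rather than the censored-data procedures of the abstract:

```python
import numpy as np

def dcov2(x, y):
    """Squared sample distance covariance for two 1-d samples (V-statistic)."""
    def dcentered(a):
        D = np.abs(a[:, None] - a[None, :])   # pairwise distances
        return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()
    A, B = dcentered(x), dcentered(y)
    return (A * B).mean()

def perm_pvalue(x, y, n_perm=200, seed=0):
    """Permutation p-value for independence: shuffling y simulates the null."""
    rng = np.random.default_rng(seed)
    observed = dcov2(x, y)
    exceed = sum(dcov2(x, rng.permutation(y)) >= observed for _ in range(n_perm))
    return (1 + exceed) / (1 + n_perm)
```

Shuffling y breaks any dependence on x while preserving both marginals, so the permuted statistics form a draw from the null distribution.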
Jingyi (Jessica) Li: A New Statistical Measure for Identifying Sparse Non-functional Relationships between Pairwise Variables
...
(e.g. genes) that exhibit specific relationships
...
which can identify linear relationships,
...
a new statistical measure for identifying certain types
...
of conditional expectation and can
Shaw-Hwa Lo: Discovering Influential Variables followed by Interaction-based learning: A Partition Retention (PR) Approach
Abstract: We consider a computer-intensive approach (PR, 09), based on an earlier method (Lo and Zheng (2002)), for detecting which, of many potential explanatory variables, have an influence on a dependent variable Y. This approach is suited to detecting influential variables in groups, where causal effects depend on the confluence of values of several variables. It has the advantage of avoiding a difficult direct analysis, involving possibly thousands of variables, guided by a measure of influence I. At this stage the objective is to discover those influential variables (in groups); we are confining our attention to locating a few needles in a haystack. After that, to deal with challenging real data applications, typically involving complex and extremely high-dimensional data, we shall introduce an interaction-based feature selection and prediction procedure, using breast cancer gene expression data as an illustrative example. The quality of the variables selected is evaluated in two ways: first by classification error rates, then by functional relevance using external biological knowledge. We demonstrate that (1) the classification error rates can be significantly reduced; and (2) incorporating interaction information into data analysis can be very rewarding in generating novel scientific findings and models. If time permits, a heuristic explanation of why and when the proposed methods may lead to such a dramatic (classification/predictive) gain is discussed.
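For intuition, an influence score of the partition-retention flavour can be computed by grouping observations on the joint values of a candidate variable set and measuring how far the within-cell means of Y sit from the grand mean. This is an illustrative sketch only, under my own naming; the exact statistic and normalisation in the PR papers may differ.

```python
from collections import defaultdict
import numpy as np

def influence_score(X, y):
    """Partition-style influence score for a candidate set of discrete
    variables: group observations by the joint value of the variables
    (the partition cells) and accumulate n_j^2 * (ybar_j - ybar)^2.
    Normalisation conventions vary across the partition-retention papers."""
    cells = defaultdict(list)
    for row, yi in zip(map(tuple, X), y):
        cells[row].append(yi)
    ybar = np.mean(y)
    return sum(len(v) ** 2 * (np.mean(v) - ybar) ** 2 for v in cells.values())
```

Variable sets whose joint values carve Y into cells with distinct means (e.g. interacting variables with weak marginal effects) score high, which is what lets the approach detect influence "in groups".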